Epstein Barr Virus Induced Genes



STATEMENT OF GOVERNMENT RIGHTS IN THE INVENTION 

Part of the work performed during development of this invention 
utilized U.S. Government funds. The U.S. Government has certain rights in 
this invention. 

BACKGROUND OF THE INVENTION 

Field of the Invention 

The present invention relates, in general, to Epstein Barr virus induced 
(EBI) genes. In particular, the present invention relates to DNA segments 
coding for EBI 1, EBI 2, or EBI 3 polypeptides; EBI 1, EBI 2, or EBI 3 
polypeptides; recombinant DNA molecules; cells containing the recombinant 
DNA molecules; antisense EBI 1, EBI 2, or EBI 3 constructs; antibodies 
having binding affinity to an EBI 1, EBI 2, or EBI 3 polypeptide; hybridomas 
containing the antibodies; nucleic acid probes for the detection of the presence 
of Epstein Barr Virus; a method of detecting Epstein Barr virus in a sample; 
and kits containing nucleic acid probes or antibodies. 

Background Information 

Epstein-Barr Virus (EBV) is the cause of infectious mononucleosis, a
benign proliferation of infected B lymphocytes (Henle, G., et al., Proc. Natl.
Acad. Sci. USA 59(1):94-101 (1968)) and can also cause acute and rapidly
progressive B lymphoproliferative disease in severely immune compromised
patients or in experimental infection of tamarins (Miller, G., Fields Virol, 2nd
ed., 1921-58 (1990)). Infection of human B lymphocytes, in vitro, results in
expression of six virus encoded nuclear proteins (EBNAs) and two virus
encoded membrane proteins (LMPs) (Kieff and Liebowitz, Fields Virol, 2nd
ed., 1889-1920 (1990)), and in substantially altered cell growth (Nilsson and
Klein, Adv. Cancer Res. 37(319):319-80 (1982)). EBV infected B
lymphocytes recapitulate features of antigen stimulation in enlarging,
increasing RNA synthesis, expressing activation antigens and adhesion
molecules, secreting Ig and proliferating (Boyd, A.W., et al., J. Immunol.
134(3):1516-23 (1985); Gordon, J., et al., Immunology 58(4):591-5 (1986);
Guy and Gordon, Intl. J. Cancer 43(4):703-8 (1989); Nilsson and Klein, Adv.
Cancer Res. 37(319):319-80 (1982); Thorley-Lawson, D.A., et al.,
J. Immunol. 134(5):3007-12 (1985)). Unlike antigen stimulated B
lymphocytes, EBV infected B lymphocytes continue to proliferate in vitro as
immortalized lymphoblastoid cell lines (LCLs) (Nilsson, K., et al., Intl. J.
Cancer 8(3):443-50 (1971)).

EBV effects on lymphocytes have been studied by comparing the 
properties of EBV-negative [EBV(-)] Burkitt lymphoma (BL) cell lines and 
EBV-positive [EBV(+)] derivatives, infected by EBV, in vitro (Calender, A., 
et al., Proc. Natl. Acad. Sci. USA 84(22):8060-4 (1987); Ehlin-Henriksson,
B., et al., Intl. J. Cancer 39(2):211-8 (1987); Nilsson and Klein, Adv. Cancer
Res. 37(319):319-80 (1982); Rowe, M., et al., Intl. J. Cancer 37(3):361-73
(1986)). EBV(-) BL cells resemble proliferating centroblasts of germinal
centers, characteristically expressing CD10, CD20, CD77 (BLA), class II
antigen, and the carbohydrate recognized by peanut agglutinin (Calender, A.,
et al., Proc. Natl. Acad. Sci. USA 84(22):8060-4 (1987); Ehlin-Henriksson,
B., et al., Intl. J. Cancer 39(2):211-8 (1987); Favrot, M.C., et al., Intl. J.
Cancer 38(6):901-6 (1986); Gregory, C.D., et al., Intl. J. Cancer 42(2):213-20
(1988); Gregory, C.D., et al., J. Gen. Virol. 71:1481-1495 (1990);
Gregory, C.D., et al., J. Immunol. 139(1):313-8 (1987); Rowe, M., et al.,
Intl. J. Cancer 37(3):361-73 (1986); Rowe, M., et al., Intl. J. Cancer
35(4):435-41 (1985)). Both EBV(-) BL cells and centroblasts lack surface IgD
and antigens associated with early phases of mitogen stimulation in vitro, 
including CD23, CD39 and CD30. In general, EBV(+) BL cells closely 
resemble EBV infected primary B lymphocytes in not expressing CD10 or 
CD77 and in expressing early activation and differentiation markers, vimentin, 
Bac-1, Bcl-2, surface IgD and CD44 (Calender, A., et al., Proc. Natl. Acad.
Sci. USA 84(22):8060-4 (1987); Ehlin-Henriksson, B., et al., Intl. J. Cancer
39(2):211-8 (1987); Favrot, M.C., et al., Intl. J. Cancer 38(6):901-6 (1986);
Gregory, C.D., et al., J. Gen. Virol. 71:1481-1495 (1990); Henderson, S.,
et al., Cell 65(7):1107-15 (1991); Rowe, M., et al., Intl. J. Cancer 37(3):361-73
(1986); Rowe, M., et al., EMBO J. 6(9):2743-51 (1987); Spira, G., et al.,
J. Immunol. 126(1):122-6 (1981); Suzuki, T., et al., J. Immunol. 137(4):1208-13
(1986)). Experiments with single gene transfer into EBV(-) B lymphoma
cells, or with specifically mutated EBV recombinants reveal that EBNA 2,
LMP 1 and EBNA 3C are essential for lymphocyte growth transformation and
alter cellular or viral gene expression. Expression of EBNA 2 alone in
EBV(-) BL cell lines results in enhanced transcription of CD23, CD21
(Cordier, M., et al., J. Virol. 64(5):1002-13 (1990); Wang, F., et al., J.
Virol. 64(5):2309-18 (1990); Wang, F., et al., Proc. Natl. Acad. Sci. USA
84(10):3452-6 (1987)), and c-fgr (Knutson, J.C., J. Virol. 64(6):2530-6
(1990)). EBNA 2 also transactivates the LMP promoters (Fahraeus, R.,
et al., Proc. Natl. Acad. Sci. USA 87(19):7390-4 (1990); Wang, F., et al., J.
Virol. 64(7):3407-16 (1990)). Analysis of a series of EBNA 2 mutants
indicates that the ability of EBNA 2 to transactivate gene expression is tightly
linked to its essential role in cell growth transformation (Cohen, J.I., et al.,
J. Virol. 65(5):2545-54 (1991)). LMP 1 is also critical to EBV's effects on
cell growth. LMP 1 transforms immortalized rodent fibroblasts (Baichwal and
Sugden, Oncogene 2(5):461-7 (1988); Wang, D., et al., Cell 43:831-40
(1985)) and induces vimentin, Bcl-2 and many of the activation markers and
adhesion molecules that EBV induces in BL cells (Birkenbach, M., et al., J.
Virol. 63(9):4079-84 (1989); Henderson, S., et al., Cell 65(7):1107-15 (1991);
Wang, D., et al., J. Virol. 62(11):4173-84 (1988)). In EBV(-) BL cells,
EBNA 3C induces higher level expression of CD21 (Wang, F., et al., J. Virol.
64(5):2309-18 (1990)).

Since altered B lymphocyte gene expression is a central theme in EBV
induced changes in B lymphocyte growth, a more complete description of the
repertoire of EBV induced genes would be advantageous prior to the
investigation of specific genes for their role as mediators of EBV effects on
cell growth. Also, because of the similar effects of EBV and antigen, EBV
induced genes are likely to include mediators of antigen induced B lymphocyte
growth or differentiation. Previously, recognition of such genes has been
largely based on increased expression of lymphocyte surface markers
(Calender, A., et al., Proc. Natl. Acad. Sci. USA 84(22):8060-4 (1987)),
defined by monoclonal antibodies derived against EBV or antigen activated B 
lymphocytes. Few of these surface markers are likely candidates for important 
effectors of EBV or antigen induced alterations in lymphocyte growth. The 
experiments described here use subtractive hybridization to identify cDNA 
clones of RNAs which are more abundant in an in vitro infected EBV(+) BL 
cell than in the non-infected EBV(-) control BL cell. 

SUMMARY OF THE INVENTION 

It is a general object of this invention to provide EBI 1, EBI 2, and 
EBI 3 DNA segments. It is a specific object of this invention to provide a 
DNA segment coding for a polypeptide having an amino acid sequence 
corresponding to an EBI 1, EBI 2, or EBI 3 polypeptide. 

It is another object of the invention to provide a substantially pure 
polypeptide having an amino acid sequence corresponding to an EBI 1, EBI 
2, or EBI 3 polypeptide. 

It is a further object of the invention to provide a nucleic acid probe 
for the detection of the presence of Epstein Barr Virus in a sample. 

It is another object of the invention to provide a method of detecting 
Epstein Barr Virus in a sample. 

It is a further object of the invention to provide a kit for identifying or 
amplifying a gene encoding an EBI 1, EBI 2, or EBI 3 polypeptide. 



It is another object of the invention to provide a DNA molecule 
comprising, 5' to 3', a promoter effective to initiate transcription in a cell and 
an EBI 1, EBI 2, or EBI 3 DNA segment.

It is a further object of the invention to provide a recombinant DNA 
molecule comprising a vector and an EBI 1, EBI 2, or EBI 3 DNA segment. 

It is a further object of the invention to provide a DNA molecule 
comprising a transcriptional region functional in a cell, a sequence
complementary to an RNA sequence encoding an amino acid sequence
corresponding to an EBI 1, EBI 2, or EBI 3 polypeptide, and a transcriptional 
termination region functional in said cell. 

It is another object of the invention to provide cells containing the 
above-described DNA molecules. 

It is a further object of the invention to provide an antibody having 
binding affinity to an EBI 1, EBI 2, or EBI 3 polypeptide, or a binding 
fragment thereof. 

It is another object of the invention to provide a hybridoma which 
produces the above-described antibody, or binding fragment thereof. 

It is a further object of the invention to provide a method of detecting 
an EBI 1, EBI 2, or EBI 3 polypeptide in a sample. 

It is another object of the invention to provide a diagnostic kit 
comprising EBI 1, EBI 2, or EBI 3 antibodies. 

Further objects and advantages of the present invention will be clear 
from the description that follows. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1. EBV induced gene (EBI) 1 and 2 RNA: Nucleotide and
deduced amino acid sequences. (A) EBI 1 has two potential translational
initiation codons. In frame stop codons are indicated by asterisks (*). A
hydrophobic amino terminal segment (single underline) is predicted to be a 
signal peptide for membrane translocation. Seven other highly hydrophobic
segments are predicted to form membrane spanning domains and are
delineated by double underlines. Potential asparagine linked glycosylation
sites (CHO#####) are present in the extracellular amino terminal segment and
third extracellular loop. The sequence motif S-(IAO-D-R-(Y/F)-X-X-X-X 
(where X represents consecutive hydrophobic residues), is highly conserved 
among a large number of G-protein coupled receptors and is indicated at the 
end of the third transmembrane domain (::::). (B) EBI 2 has 2 possible 
initiator methionine codons. Predicted transmembrane domains are indicated 
(double underlines). No signal peptide sequence was identified. The amino 
terminal extracellular segment contains a potential N-linked glycosylation site 
(CHO######). 

Figure 2. RNA blot hybridization analysis of EBV induced cellular 
gene expression. Polyadenylated RNA (4 to 12 µg per lane) was size fractionated
on formaldehyde agarose gels, transferred to charged nylon membranes, and
hybridized with the probes indicated at the bottom of each autoradiograph
panel. RNA samples used are indicated at the top of each lane (LCL: EBV
immortalized primary B lymphoblastoid cell line, IB4; BL: EBV negative
Burkitt lymphoma cell line, BL41; EBL: EBV infected Burkitt lymphoma cell
line, BL41/B95-8, derived by in vitro infection of BL41 line). Dashes indicate
positions of ribosomal RNA bands (18s, 28s). The band detected at 1.5 kb in 
the LCL lane by the P68 probe is due to residual signal from a prior 
hybridization. 

Figure 3. Expression of EBI 1 and EBI 2 receptor genes in human 
lymphoid tissues and cell lines. 32P-labelled probes indicated at the left of 
each panel were hybridized to blots containing RNA from the cell lines 
indicated at the top of each lane. BL41 and BL30 are EBV-negative BL cell 
lines; BL41/P3HR1 is infected with a non-transforming EBV strain, P3HR1;
BL41/B95-8 is infected with a transforming EBV strain; IB4 is a cell line
derived by infecting primary B lymphocytes with EBV of the B95-8 strain;
LCL/W91 is a recently established cell line transformed with EBV strain W91;
TONSIL is unfractionated cells from surgically excised human tonsil; PBMC
is unfractionated peripheral blood mononuclear cells; PBMC PWM is PBMC
stimulated 72 h with pokeweed mitogen (2.5 µg/ml); PBT PHA is T cells
purified from PBMC by sheep erythrocyte rosetting, stimulated 72 h with
phytohemagglutinin (1 µg/ml); B MARR is post-mortem bone marrow;
SPLEEN is unfractionated cells from surgically excised spleen; HL60 is a
promyelocytic leukemia cell line; U937 is a monocytic leukemia cell line;
K562 is a chronic myelogenous leukemia cell line; JURKAT is a T cell
leukemia cell line; HSB-2 is a T cell acute lymphoblastic leukemia cell line;
RHEK-1 is an adenovirus/SV40 transformed human keratinocyte; TK143 is an
osteosarcoma cell line. Each panel is a composite prepared from
autoradiographs of two separate blots for each probe. 

Figure 4. EBI 1 and EBI 2 gene expression in human tissues. EBI 1, 
EBI 2 and immunoglobulin mu chain (Igμ) probes were hybridized to RNA
samples from the following human tissues: heart (HE), brain (BR), placenta
(PL), lung (LU), liver (LI), kidney (KI), skeletal muscle (SM) and pancreas
(PA). Numbers at the left indicate positions and sizes (in kb) of RNA
markers. Specific RNA bands are indicated by arrows to the right of each
panel. The EBI 1 probe detects faint 2.4 kb bands in lung and pancreas RNA.
The EBI 2 probe detects an abundant 1.9 kb RNA in lung, and a faint 1.9 kb
band in pancreas. The 2.7 kb Igμ RNA is detected in lung, liver and
pancreas preparations. The 1.5 kb band in placental RNA hybridized with
Igμ probe is residual signal from a previous hybridization.

Figure 5. Complete nucleotide and deduced amino acid sequences of 
EBI 3 cDNA. The 1164 nucleotide EBI 3 cDNA contains a 690 nucleotide 
open reading frame encoding a 26 kD polypeptide. A hydrophobic amino 
terminal segment (bold underline) comprises a signal peptide for membrane 
translocation. No other hydrophobic segments that could potentially form a 
transmembrane domain are evident. Two potential asparagine-linked 
glycosylation sites are indicated (CHO###). The nucleotide sequence of the 
3' untranslated region bears significant homology with the human Alu repeat 
element (light underline). 



Figure 6. RNA blot hybridization analysis of EBI 3 gene expression. 
Polyadenylated RNA (4 to 12 µg/lane) was size fractionated on formaldehyde
agarose gel, transferred to an activated nylon membrane and hybridized with
32P-labeled EBI 3 cDNA, actin and glyceraldehyde-3-phosphate dehydrogenase
(GAPDH) probes. RNA samples used in each lane are indicated at the top.
(LCL is the EBV-immortalized primary B lymphoblastoid cell line, IB4; BL
is the EBV-negative Burkitt lymphoma cell line, BL41; EBL is the
EBV-infected Burkitt lymphoma cell line, BL41/B95-8, derived by in vitro
infection of BL41 line.) An abundant 1.5 kb RNA is recognized by the EBI 3
probe in both EBV-infected cell line RNA samples (LCL, EBL), but is
undetectable in the EBV-negative cell sample (BL). Control hybridizations with
actin and GAPDH probes indicate that the BL lane contains as much or more
RNA than the EBV-infected cell lanes. Dashes indicate positions of ribosomal
RNA bands (18s, 28s).

Figure 7. Expression of EBI 3 gene RNA in human tissues and cell
lines.

(A) EBI 3 or actin probes were hybridized to blots containing RNA
from the cell lines or lymphoid tissues indicated at the top of each lane.
(BL41 and BL30 are EBV(-) Burkitt lymphoma cell lines; BL41/P3HR1 is
infected with the non-transforming P3HR1 strain of EBV; BL41/B95-8 is
infected with the transforming B95-8 strain of EBV; IB4 is a lymphoblastoid
cell line generated by transformation of primary B lymphocytes with B95-8
virus; LCL-W91 is a cell line transformed with the W91 EBV strain; TONSIL
represents unfractionated cells from surgically excised human tonsil; PBMC
is unfractionated peripheral blood mononuclear cells; PBMC-PWM is PBMC
stimulated 72 h with pokeweed mitogen (2.5 µg/ml); PBT-PHA is T cells
purified from PBMC by sheep erythrocyte rosetting, stimulated 72 h with
phytohemagglutinin (1.0 µg/ml); B MARR is post-mortem costal bone marrow;
SPLEEN is unfractionated cells from surgically excised normal human spleen;
HL60 is a promyelocytic leukemia cell line; U937 is a histiocytic lymphoma
cell line with monocyte features; K562 is a chronic myelogenous leukemia
cell line; Jurkat is a T cell leukemia; TK143 is an osteosarcoma line.) Each
autoradiographic panel was generated from two separate blots.

(B) EBI 3 was hybridized to a commercially prepared blot (Multiple
Tissue Northern, Clontech, CA) containing polyadenylated RNA (2 µg/lane)
from each of the following human tissues: heart (HE), brain (BR), placenta
(PL), lung (LU), liver (LI), kidney (KI), skeletal muscle (SM) and pancreas
(PA). The EBI 3 probe specifically detects an abundant 1.5 kb RNA in the
placental RNA preparation (position indicated by arrow). A faint band of
similar size is also observed in liver RNA. Numbers at the left indicate
positions and sizes (in kb) of RNA markers.

DEFINITIONS 

In the description that follows, a number of terms used in recombinant 
DNA (rDNA) technology are extensively utilized. In order to provide a clear 
and consistent understanding of the specification and claims, including the 
scope to be given such terms, the following definitions are provided.

DNA segment . A DNA segment, as is generally understood and used 
herein, refers to a molecule comprising a linear stretch of nucleotides wherein 
the nucleotides are present in a sequence that may encode, through the genetic 
code, a molecule comprising a linear sequence of amino acid residues that is 
referred to as a protein, a protein fragment or a polypeptide.

Gene . A DNA sequence related to a single polypeptide chain or 
protein, and as used herein includes the 5' and 3' untranslated ends. The 
polypeptide can be encoded by a full-length sequence or any portion of the 
coding sequence, so long as the functional activity of the protein is retained. 
A "complementary DNA" or "cDNA" gene includes recombinant genes
synthesized by reverse transcription of messenger RNA ("mRNA").

Structural gene . A DNA sequence that is transcribed into mRNA that
is then translated into a sequence of amino acids characteristic of a specific
polypeptide.

Restriction Endonuclease . A restriction endonuclease (also restriction
enzyme) is an enzyme that has the capacity to recognize a specific base
sequence (usually 4, 5, or 6 base pairs in length) in a DNA molecule, and to
cleave the DNA molecule at every place where this sequence appears. For
example, EcoRI recognizes the base sequence GAATTC/CTTAAG.

Restriction Fragment . The DNA molecules produced by digestion with 
a restriction endonuclease are referred to as restriction fragments. Any given 
genome may be digested by a particular restriction endonuclease into a discrete 
set of restriction fragments. 
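
By way of non-limiting illustration, the two definitions above can be sketched in Python. The function names, the `cut_offset` parameter, and the example sequence are hypothetical and are not part of the specification; the sketch assumes a linear molecule and models only the top-strand cut of EcoRI between G and AATTC.

```python
def find_sites(dna: str, site: str = "GAATTC") -> list:
    """Return 0-based start positions of every occurrence of `site` in `dna`."""
    dna = dna.upper()
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


def digest(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> list:
    """Cut a linear DNA molecule at every site and return the fragments in order.

    cut_offset is the number of bases into the site at which the top strand
    is cut; 1 models EcoRI cutting between G and AATTC.
    """
    dna = dna.upper()
    fragments = []
    previous_cut = 0
    for pos in find_sites(dna, site):
        cut = pos + cut_offset
        fragments.append(dna[previous_cut:cut])
        previous_cut = cut
    fragments.append(dna[previous_cut:])  # tail after the last cut
    return fragments


# Hypothetical sequence containing two EcoRI recognition sites.
example = "AAGAATTCCCGGGAATTCTT"
print(find_sites(example))
print(digest(example))
```

Joining the returned fragments in order reproduces the input sequence, which reflects the statement that a given molecule is digested into a discrete set of restriction fragments.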

Agarose Gel Electrophoresis . To detect a polymorphism in the length
of restriction fragments, an analytical method for fractionating double-stranded 
DNA molecules on the basis of size is required. The most commonly used 
technique (though not the only one) for achieving such a fractionation is 
agarose gel electrophoresis. The principle of this method is that DNA 
molecules migrate through the gel as though it were a sieve that retards the 
movement of the largest molecules to the greatest extent and the movement of 
the smallest molecules to the least extent. Note that the smaller the DNA 
fragment, the greater the mobility under electrophoresis in the agarose gel. 

The DNA fragments fractionated by agarose gel electrophoresis can be 
visualized directly by a staining procedure if the number of fragments included 
in the pattern is small. The DNA fragments of genomes can be visualized 
successfully. However, most genomes, including the human genome, contain 
far too many DNA sequences to produce a simple pattern of restriction 
fragments. For example, the human genome is digested into approximately 
1,000,000 different DNA fragments by EcoRI. In order to visualize a small
subset of these fragments, a methodology referred to as the Southern 
hybridization procedure can be applied. 
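
The order of magnitude of the fragment count can be checked with a short calculation. The sketch below is an estimate only: it assumes a haploid human genome of roughly 3 x 10^9 base pairs and equal, independent base frequencies, neither of which is stated in the specification, and the function name is hypothetical.

```python
def expected_fragment_count(genome_size_bp: int, site_length: int) -> float:
    """Expected number of fragments when all four bases are equally likely
    and independent at every position (a simplifying assumption)."""
    expected_spacing = 4 ** site_length  # mean distance between sites, in bp
    return genome_size_bp / expected_spacing


HUMAN_GENOME_BP = 3_000_000_000  # assumed approximate haploid genome size

print(4 ** 6)  # mean spacing of a 6 bp site such as GAATTC
print(round(expected_fragment_count(HUMAN_GENOME_BP, 6)))
```

Under these assumptions a six base pair site occurs about once every 4,096 base pairs, giving several hundred thousand fragments, which is of the same order as the approximate figure given above.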

Southern Transfer Procedure . The purpose of the Southern transfer 
procedure (also referred to as blotting) is to physically transfer DNA 
fractionated by agarose gel electrophoresis onto a nitrocellulose filter paper or 
another appropriate surface or method, while retaining the relative positions 
of DNA fragments resulting from the fractionation procedure. The
methodology used to accomplish the transfer from agarose gel to nitrocellulose 
involves drawing the DNA from the gel into the nitrocellulose paper by 
capillary action. 

Nucleic Acid Hybridization . Nucleic acid hybridization depends on the
principle that two single-stranded nucleic acid molecules that have
complementary base sequences will reform the thermodynamically favored
double-stranded structure if they are mixed under the proper conditions. The
double-stranded structure will be formed between two complementary
single-stranded nucleic acids even if one is immobilized on a nitrocellulose
filter. In the Southern hybridization procedure, the latter situation occurs.
As noted previously, the DNA of the individual to be tested is digested with a
restriction endonuclease, fractionated by agarose gel electrophoresis,
converted to the single-stranded form, and transferred to nitrocellulose paper,
making it available for reannealing to the hybridization probe.
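
The complementarity principle described above can be illustrated with a minimal Python sketch. The helper names and the sequences are hypothetical; the sketch considers only perfect Watson-Crick matches on a single immobilized strand, whereas actual hybridization tolerates mismatches to a degree that depends on the conditions used.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the strand that pairs with `seq`, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probe_binding_sites(target: str, probe: str) -> list:
    """Return 0-based positions on `target` (5' to 3') where `probe`
    can anneal with perfect complementarity."""
    target = target.upper()
    footprint = reverse_complement(probe)  # what the probe pairs with
    hits = []
    start = target.find(footprint)
    while start != -1:
        hits.append(start)
        start = target.find(footprint, start + 1)
    return hits


# Hypothetical immobilized strand and a probe complementary to part of it.
target = "TTGACCGGAATTCATG"
probe = reverse_complement("CCGGAATT")
print(probe)
print(probe_binding_sites(target, probe))
```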

Hybridization Probe . To visualize a particular DNA sequence in the
Southern hybridization procedure, a labeled DNA molecule or hybridization
probe is reacted with the fractionated DNA bound to the nitrocellulose filter.
The areas on the filter that carry DNA sequences complementary to the
labeled DNA probe become labeled themselves as a consequence of the
reannealing reaction. The areas of the filter that exhibit such labeling are
visualized. The hybridization probe is generally produced by molecular
cloning of a specific DNA sequence.

Oligonucleotide or Oligomer . A molecule comprised of two or more
deoxyribonucleotides or ribonucleotides, preferably more than three. Its exact
size will depend on many factors, which in turn depend on the ultimate
function or use of the oligonucleotide. An oligonucleotide may be derived
synthetically or by cloning.

Sequence Amplification . A method for generating large amounts of a
target sequence. In general, one or more amplification primers are annealed
to a nucleic acid sequence. Using appropriate enzymes, sequences found
adjacent to, or in between the primers are amplified.

Amplification Primer . An oligonucleotide which is capable of
annealing adjacent to a target sequence and serving as an initiation point for
DNA synthesis when placed under conditions in which synthesis of a primer
extension product which is complementary to a nucleic acid strand is initiated.
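
As a rough illustration of the two amplification definitions above, the following Python sketch predicts the product bounded by a pair of primers on a single template strand. The function names, primers, and template are hypothetical; the sketch assumes perfect primer matches and does not model any particular enzyme or cycling protocol.

```python
from typing import Optional

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the strand that pairs with `seq`, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplify(template: str, forward: str, reverse: str) -> Optional[str]:
    """Return the top-strand product bounded by two primers, or None if
    either primer has no perfect annealing site.

    `template` is the top strand, 5' to 3'. The forward primer matches the
    top strand; the reverse primer anneals to the top strand, so its
    reverse complement is what appears in `template`.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Hypothetical template and primer pair.
template = "GGATCCATGAAACCCGGGTTTTAGCTCGAG"
print(amplify(template, "ATGAAA", "CTAAAA"))
```

The sequence lying between the primers, together with the primer sites themselves, is what is returned, corresponding to the statement that sequences in between the primers are amplified.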

Vector . A plasmid or phage DNA or other DNA sequence into which
DNA may be inserted to be cloned. The vector may replicate autonomously
in a host cell, and may be further characterized by one or a small number of
endonuclease recognition sites at which such DNA sequences may be cut in
a determinable fashion and into which DNA may be inserted. The vector may
further contain a marker suitable for use in the identification of cells
transformed with the vector. Markers, for example, are tetracycline resistance
or ampicillin resistance. The words "cloning vehicle" are sometimes used for
"vector."

Expression . Expression is the process by which a structural gene 
produces a polypeptide. It involves transcription of the gene into mRNA, and 
the translation of such mRNA into polypeptide(s). 
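
The two steps named in this definition can be sketched in Python using the standard genetic code. The function names and the example sequence are hypothetical; the sketch assumes the input is the coding strand, starts translation at the first ATG, and ignores splicing and all regulatory elements.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA corresponding to a DNA coding strand (T becomes U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    dna = mrna.upper().replace("U", "T")
    start = dna.find("ATG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical structural gene fragment (coding strand).
gene = "GGATGAAACCCTGGTAAGG"
mrna = transcribe(gene)
print(mrna)
print(translate(mrna))
```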

Expression vector . A vector or vehicle similar to a cloning vector but
which is capable of expressing a gene which has been cloned into it, after
transformation into a host. The cloned gene is usually placed under the
control of (i.e., operably linked to) certain control sequences such as promoter
sequences.

Expression control sequences will vary depending on whether the
vector is designed to express the operably linked gene in a prokaryotic or
eukaryotic host and may additionally contain transcriptional elements such as
enhancer elements, termination sequences, tissue-specificity elements, and/or
translational initiation and termination sites.

Functional Derivative . A "functional derivative" of a sequence, either
protein or nucleic acid, is a molecule that possesses a biological activity (either
functional or structural) that is substantially similar to a biological activity of
the protein or nucleic acid sequence. A functional derivative of a protein may
or may not contain post-translational modifications such as covalently linked
carbohydrate, depending on the necessity of such modifications for the
performance of a specific function. The term "functional derivative" is
intended to include the "fragments," "segments," "variants," "analogs," or
"chemical derivatives" of a molecule.

As used herein, a molecule is said to be a "chemical derivative" of 
another molecule when it contains additional chemical moieties not normally 
a part of the molecule. Such moieties may improve the molecule's solubility, 
absorption, biological half life, and the like. The moieties may alternatively 
decrease the toxicity of the molecule, eliminate or attenuate any undesirable 
side effect of the molecule, and the like. Moieties capable of mediating such
effects are disclosed in Remington's Pharmaceutical Sciences (1980). 
Procedures for coupling such moieties to a molecule are well known in the art. 

Fragment . A "fragment" of a molecule such as a protein or nucleic 
acid is meant to refer to any portion of the amino acid or nucleotide genetic 
sequence. 

Variant . A "variant" of a protein or nucleic acid is meant to refer to 
a molecule substantially similar in structure and biological activity to either
the protein or nucleic acid, or to a fragment thereof. Thus, provided that two
molecules possess a common activity and may substitute for each other, they 
are considered variants as that term is used herein even if the composition or 
secondary, tertiary, or quaternary structure of one of the molecules is not 
identical to that found in the other, or if the amino acid or nucleotide sequence 
is not identical. 

Analog . An "analog" of a protein or genetic sequence is meant to refer 
to a protein or genetic sequence substantially similar in function to a protein 
or genetic sequence described herein. 

Allele . An "allele" is an alternative form of a gene occupying a given 
locus on the chromosome. 

Mutation . A "mutation" is any detectable change in the genetic
material which may be transmitted to daughter cells and possibly even to
succeeding generations giving rise to mutant cells or mutant individuals. If the
descendants of a mutant cell give rise only to somatic cells in multicellular
organisms, a mutant spot or area of cells arises. Mutations in the germ line
of sexually reproducing organisms may be transmitted by the gametes to the
next generation resulting in an individual with the new mutant condition in
both its somatic and germ cells. A mutation may be any (or a combination of)
detectable, unnatural change affecting the chemical or physical constitution,
mutability, replication, phenotypic function, or recombination of one or more
deoxyribonucleotides; nucleotides may be added, deleted, substituted for,
inverted, or transposed to new positions with and without inversion.
Mutations may occur spontaneously and can be induced experimentally by
application of mutagens. A mutant variation of a DNA segment results from
a mutation. A mutant polypeptide may result from a mutant DNA segment.

Species . A "species" is a group of actually or potentially interbreeding 
natural populations. A species variation within a DNA segment or protein is 
a change in the nucleic acid or amino acid sequence that occurs among species 
and may be determined by DNA sequencing of the segment in question. 

Substantially Pure . A "substantially pure" protein or nucleic acid is a
protein or nucleic acid preparation that is generally lacking in other cellular
components.

DETAILED DESCRIPTION OF THE INVENTION 






The present invention relates to novel DNA sequences, EBI 1, EBI 2, 
and EBI 3, which have been identified as Epstein Barr virus induced genes. 

A. DNA segments coding for EBI 1, EBI 2, and EBI 3 polypeptides,
and fragments thereof.

In one embodiment, the present invention relates to a DNA segment 
coding for a polypeptide having an amino acid sequence corresponding to a 
polypeptide selected from the group consisting of EBI 1, EBI 2, and EBI 3 
polypeptides, or at least 7 contiguous amino acids thereof (preferably, at least 
10, 15, 20, or 30 contiguous amino acids thereof). In one preferred 
embodiment, the DNA segment comprises the sequences set forth in SEQ ID 
NO:1, SEQ ID NO:3, or SEQ ID NO:5; allelic, mutant or species variation
thereof, or at least 20 contiguous nucleotides thereof (preferably at least 25, 
30, 40, or 50 contiguous nucleotides thereof). In another preferred 
embodiment, the DNA segment encodes an amino acid sequence selected from 
the group consisting of sequences set forth in SEQ ID NO:2, SEQ ID NO:4, 
and SEQ ID NO:6, or mutant or species variation thereof, or at least 7 
contiguous amino acids thereof (preferably, at least 10, 15, 20, or 30 
contiguous amino acids thereof). 

Also included within the scope of this invention are the functional 
equivalents of the herein-described DNA or nucleotide sequences. The 
degeneracy of the genetic code permits substitution of certain codons by other 
codons which specify the same amino acid and hence would give rise to the 
same protein. The DNA or nucleotide sequence can vary substantially since, 
with the exception of methionine and tryptophan, the known amino acids can 
be coded for by more than one codon. Thus, portions or all of the EBI 1, EBI 
2, or EBI 3 gene could be synthesized to give a DNA sequence significantly 
different from that shown in SEQ ID NO:1, SEQ ID NO:3, or SEQ ID NO:5.
The encoded amino acid sequence thereof would, however, be preserved. 
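
The extent of this degeneracy can be made concrete with a small Python sketch based on the standard genetic code. The function name and the four-residue peptide are hypothetical and are not drawn from SEQ ID NO:2, 4, or 6; the sketch simply counts how many distinct coding sequences specify the same amino acid sequence.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons that specify each amino acid ("*" marks stop codons).
CODONS_PER_AA = Counter(CODON_TABLE.values())


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences that encode `peptide` exactly."""
    return prod(CODONS_PER_AA[aa] for aa in peptide.upper())


# Amino acids specified by exactly one codon.
single_codon = sorted(aa for aa, n in CODONS_PER_AA.items() if n == 1)
print(single_codon)

# Hypothetical peptide: Met-Lys-Pro-Trp.
print(count_encodings("MKPW"))
```

Only methionine (M) and tryptophan (W) have a single codon, consistent with the exception noted above; the count grows multiplicatively with peptide length, so a full-length coding sequence admits a very large number of synonymous variants.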

In addition, the DNA or nucleotide sequence may comprise a 
nucleotide sequence which results from the addition, deletion or substitution 
of at least one nucleotide to the 5'-end and/or the 3'-end of the DNA formula 
shown in SEQ ID NO:1, SEQ ID NO:3, or SEQ ID NO:5 or a derivative 
thereof. Any nucleotide or polynucleotide may be used in this regard, 
provided that its addition, deletion or substitution does not alter the amino acid 
sequence of SEQ ID NO:2, SEQ ID NO:4, or SEQ ID NO:6 which is 
encoded by the nucleotide sequence. For example, the present invention is 
intended to include any nucleotide sequence resulting from the addition of 
ATG as an initiation codon at the 5'-end of the inventive nucleotide sequence 
or its derivative, or from the addition of TAA, TAG or TGA as a termination 
codon at the 3'-end of the inventive nucleotide sequence or its derivative. 
Moreover, the DNA fragment of the present invention may, as necessary, 
have restriction endonuclease recognition sites added to its 5'-end and/or 
3'-end. 
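
The end modifications described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the nine-nucleotide fragment is hypothetical, the function names are invented for the example, and EcoRI (GAATTC) and BamHI (GGATCC) are chosen simply as two familiar restriction endonuclease recognition sites.

```python
STOP_CODONS = ("TAA", "TAG", "TGA")

# Recognition sequences of two common restriction endonucleases (illustrative choice).
ECORI = "GAATTC"
BAMHI = "GGATCC"


def add_start_and_stop(coding, stop="TAA"):
    """Prepend ATG and append a stop codon unless the sequence already has them."""
    if len(coding) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if stop not in STOP_CODONS:
        raise ValueError("stop must be TAA, TAG or TGA")
    if not coding.startswith("ATG"):
        coding = "ATG" + coding
    if coding[-3:] not in STOP_CODONS:
        coding = coding + stop
    return coding


def add_flanking_sites(seq, five_prime=ECORI, three_prime=BAMHI):
    """Add restriction endonuclease recognition sites to the 5'- and 3'-ends."""
    return five_prime + seq + three_prime


fragment = "GCTTGGAAA"  # hypothetical: Ala-Trp-Lys, lacking start and stop codons
orf = add_start_and_stop(fragment, stop="TGA")
construct = add_flanking_sites(orf)
print(orf)
print(construct)
```

Because the added codons and sites lie outside the original codons, the amino acids encoded by the fragment itself are not altered.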

Such functional alterations of a given DNA or nucleotide sequence 
afford an opportunity to promote secretion and/or processing of heterologous 
proteins encoded by foreign DNA sequences fused thereto. All variations of 
the nucleotide sequence of the EBI 1, EBI 2, and EBI 3 genes and fragments 
thereof permitted by the genetic code are, therefore, included in this invention. 

Further, it is possible to delete codons or to substitute one or more 
codons by codons other than degenerate codons to produce a structurally 
modified polypeptide, but one which has substantially the same utility or 
activity as the polypeptide produced by the unmodified DNA molecule. As 
recognized in the art, the two polypeptides are functionally equivalent, as are 
the two DNA molecules which give rise to their production, even though the 
differences between the DNA molecules are not related to degeneracy of the 
genetic code. 

A.1. Isolation of DNA. 

In one aspect of the present invention, DNA segments coding for 
polypeptides having amino acid sequences corresponding to EBI 1, EBI 2, and 
EBI 3 are provided. In particular, the DNA segment may be isolated from a 
biological sample containing RNA or DNA. 

The DNA segment may be isolated from a biological sample containing 
RNA using the techniques of cDNA cloning and subtractive hybridization as 
previously described (Birkenbach et al., J. of Virology 63:9:4079-4084). The 
DNA segment may also be isolated from a cDNA library using a homologous 
probe. 

The DNA segment may be isolated from a biological sample containing 
genomic DNA or from a genomic library using techniques well known in the 
art. Suitable biological samples include, but are not limited to, blood, semen 
and tissue. The method of obtaining the biological sample will vary depending 
upon the nature of the sample. 

One skilled in the art will realize that the human genome may be 
subject to slight allelic variations between individuals. Therefore, the isolated 
DNA segment is also intended to include allelic variations, so long as the 
sequence is a functional derivative of the EBI 1, EBI 2, or EBI 3 gene. 

One skilled in the art will realize that organisms other than humans 
may also contain EBI 1, EBI 2, or EBI 3 genes (for example, eukaryotes; 
more specifically, mammals, birds, fish, and plants; more specifically, 
gorillas, rhesus monkeys, and chimpanzees). The invention is intended to 
include, but not be limited to, EBI 1, EBI 2, and EBI 3 DNA segments 
isolated from the above-described organisms. 

A.2. Synthesis of DNA. 

In the alternative, the DNA segment of the present invention may be 
chemically synthesized. For example, a DNA fragment with the nucleotide 
sequence which codes for the expression product of an EBI 1, EBI 2, or EBI 
3 gene may be designed and, if necessary, divided into appropriate smaller 
fragments. Then an oligomer which corresponds to the DNA fragment, or to 
each of the divided fragments, may be synthesized. Such synthetic 
oligonucleotides may be prepared, for example, by the triester method of 
Matteucci et al., J. Am. Chem. Soc. 103:3185-3191 (1981) or by using an 
automated DNA synthesizer. 

An oligonucleotide may be derived synthetically or by cloning. If 
necessary, the 5'-ends of the oligomers may be phosphorylated using T4 
polynucleotide kinase. Kinasing of single strands prior to annealing or for 
labeling may be achieved using an excess of the enzyme. If kinasing is for the 
labeling of probe, the ATP may contain high specific activity radioisotopes. 
Then, the DNA oligomer may be subjected to annealing and ligation with T4 
ligase or the like. 
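
The division of a designed sequence into smaller oligomers for annealing and ligation can be sketched as follows. This is a simplified illustration, not the procedure of the cited reference: the 100-nucleotide sequence, the 40-nucleotide maximum length, and the function names are all assumptions made for the example. The junctions on the bottom strand are offset from those on the top strand so that each bottom-strand oligomer bridges a top-strand junction when annealed.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def design_oligos(seq, max_len=40):
    """Split a duplex into top- and bottom-strand oligomers with staggered junctions.

    Top-strand oligomers are consecutive pieces of seq. Bottom-strand oligomers
    are reverse complements of pieces whose boundaries are shifted by half of
    max_len, so no junction falls at the same position on both strands.
    """
    if max_len < 2:
        raise ValueError("max_len must be at least 2")
    top_cuts = list(range(0, len(seq), max_len)) + [len(seq)]
    offset = max_len // 2
    bottom_cuts = [0] + list(range(offset, len(seq), max_len)) + [len(seq)]
    top = [seq[a:b] for a, b in zip(top_cuts, top_cuts[1:])]
    bottom = [reverse_complement(seq[a:b]) for a, b in zip(bottom_cuts, bottom_cuts[1:])]
    return top, bottom


designed = "ATGC" * 25  # hypothetical 100-nucleotide designed sequence
top, bottom = design_oligos(designed, max_len=40)
print([len(o) for o in top], [len(o) for o in bottom])
```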

B. Substantially pure EBI 1, EBI 2, and EBI 3 polypeptides. 

In another embodiment, the present invention relates to a substantially 
pure polypeptide having an amino acid sequence corresponding to a 
polypeptide selected from the group consisting of EBI 1, EBI 2, and EBI 3 
polypeptides, or at least 7 contiguous amino acids thereof (preferably, at least 
10, 15, 20, or 30 contiguous amino acids thereof). In a preferred 
embodiment, the polypeptide has an amino acid sequence selected from the 
group consisting of sequences set forth in SEQ ID NO:2, SEQ ID NO:4, and 
SEQ ID NO:6, or mutant or species variation thereof, or at least 7 contiguous 
amino acids thereof (preferably, at least 10, 15, 20, or 30 contiguous amino 
acids thereof). 

A variety of methodologies known in the art can be utilized to obtain 
the peptide of the present invention. In one embodiment, the peptide is 
purified from tissues or cells which naturally produce the peptide. The 
samples of the present invention include cells, protein extracts or membrane 
extracts of cells, or biological fluids. The sample will vary based on the assay 
format, the detection method and the nature of the tissues, cells or extracts 
used as the sample. 

Any eukaryotic organism can be used as a source for the peptide of the 
invention, as long as the source organism naturally contains such a peptide. 
As used herein, "source organism" refers to the original organism from which 
the amino acid sequence of the subunit is derived, regardless of the organism 
the subunit is expressed in and ultimately isolated from. 

One skilled in the art can readily follow known methods for isolating 
proteins in order to obtain the peptide free of natural contaminants. These 
include, but are not limited to: immunochromatography, size-exclusion 
chromatography, HPLC, ion-exchange chromatography, and immuno-affinity 
chromatography. 

C. A nucleic acid probe for the detection of Epstein Barr virus. 

In another embodiment, the present invention relates to a nucleic acid 
probe for the detection of the presence of Epstein Barr Virus in a sample 
comprising the above-described DNA segments or at least 20 contiguous 
nucleotides thereof (preferably at least 25, 30, 40, or 50 thereof). In another 
preferred embodiment, the DNA segment has a nucleic acid sequence selected 
from the group consisting of sequences set forth in SEQ ID NO:1, SEQ ID 
NO:3, and SEQ ID NO:5, or at least 20 contiguous nucleotides thereof 
(preferably at least 25, 30, 40, or 50 thereof). In another preferred 
embodiment, the nucleic acid probe encodes an amino acid sequence selected 
from the group consisting of sequences set forth in SEQ ID NO:2, SEQ ID 
NO:4, and SEQ ID NO:6, or at least 7 contiguous amino acids thereof. 
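
The "at least 20 contiguous nucleotides" criterion lends itself to a simple check. The sketch below is illustrative only: the 30-nucleotide segment is hypothetical, the function names are invented, and treating the reverse complement as an acceptable probe is an assumption made here because either strand can hybridize to a duplex target.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def windows(segment, length):
    """List every contiguous stretch of the given length within the segment."""
    return [segment[i:i + length] for i in range(len(segment) - length + 1)]


def is_contiguous_probe(candidate, segment, min_len=20):
    """True if candidate is at least min_len long and lies contiguously in either strand."""
    candidate, segment = candidate.upper(), segment.upper()
    if len(candidate) < min_len:
        return False
    return candidate in segment or candidate in reverse_complement(segment)


segment = "ATGGCTTGGAAAGCTAGCTAGGATCCGATC"  # hypothetical 30-nucleotide segment
print(len(windows(segment, 20)))
print(is_contiguous_probe(segment[5:25], segment))
```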

The nucleic acid probe may be used to probe an appropriate 
chromosomal or cDNA library by usual hybridization methods to obtain 
another DNA segment of the present invention. A chromosomal DNA or 
cDNA library may be prepared from appropriate cells according to recognized 
methods in the art (cf. Molecular Cloning: A Laboratory Manual, second 
edition, edited by Sambrook, Fritsch, & Maniatis, Cold Spring Harbor 
Laboratory, 1989). 

In the alternative, chemical synthesis is carried out in order to obtain 
nucleic acid probes having nucleotide sequences which correspond to 
N-terminal and C-terminal portions of the amino acid sequence of the polypeptide 
of interest. Thus, the synthesized nucleic acid probes may be used as primers 
in a polymerase chain reaction (PCR) carried out in accordance with 
recognized PCR techniques, essentially according to PCR Protocols, A Guide 
to Methods and Applications (1990), utilizing the appropriate chromosomal 
or cDNA library to obtain a DNA segment of the present invention. 
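
Deriving primers from N-terminal and C-terminal amino acid sequences means working backwards through the genetic code, so a single peptide may correspond to several nucleotide sequences. The following sketch, which assumes the standard genetic code and uses hypothetical two-residue peptides and invented function names, counts and enumerates those sequences. Reverse primers for a C-terminal peptide are taken as the reverse complements of its encodings.

```python
from itertools import product
from math import prod

# Standard genetic code, with codons enumerated in base order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def degeneracy(peptide):
    """Number of distinct DNA sequences that encode the peptide."""
    return prod(len(SYNONYMS[aa]) for aa in peptide)


def forward_primers(peptide):
    """All sense-strand sequences encoding an N-terminal peptide."""
    return sorted("".join(p) for p in product(*(SYNONYMS[aa] for aa in peptide)))


def reverse_primers(peptide):
    """All antisense-strand sequences matching a C-terminal peptide."""
    return sorted(reverse_complement(s) for s in forward_primers(peptide))


print(degeneracy("MK"), forward_primers("MK"))  # hypothetical N-terminal dipeptide
print(reverse_primers("W"))                      # hypothetical C-terminal residue
```

Peptide stretches rich in methionine and tryptophan give the least degenerate primers, since each of those residues has a single codon.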

One skilled in the art can readily design such probes based on the 
sequence disclosed herein using methods of computer alignment and sequence 
analysis known in the art (cf. Molecular Cloning: A Laboratory Manual, 
second edition, edited by Sambrook, Fritsch, & Maniatis, Cold Spring Harbor 
Laboratory, 1989). 

The hybridization probes of the present invention can be labeled by 
standard labeling techniques such as with a radiolabel, enzyme label, 
fluorescent label, biotin-avidin label, chemiluminescence, and the like. After 
hybridization, the probes may be visualized using known methods. 

The nucleic acid probes of the present invention include RNA, as well 
as DNA probes, such probes being generated using techniques known in the 
art. 

In one embodiment of the above described method, a nucleic acid 
probe is immobilized on a solid support. Examples of such solid supports 
include, but are not limited to, plastics such as polycarbonate, complex 
carbohydrates such as agarose and sepharose, and acrylic resins, such as 
polyacrylamide and latex beads. Techniques for coupling nucleic acid probes 
to such solid supports are well known in the art. 

The test samples suitable for nucleic acid probing methods of the 
present invention include, for example, cells or nucleic acid extracts of cells, 
or biological fluids. The sample used in the above-described methods will 
vary based on the assay format, the detection method and the nature of the 
tissues, cells or extracts to be assayed. Methods for preparing nucleic acid 
extracts of cells are well known in the art and can be readily adapted in order 
to obtain a sample which is compatible with the method utilized. 



D. A method of detecting the presence of Epstein Barr virus in a sample. 



In another embodiment, the present invention relates to a method of 
detecting the presence of Epstein Barr virus in a sample comprising a) 
contacting said sample with the above-described nucleic acid probe, under 
conditions such that hybridization occurs, and b) detecting the presence of said 
probe bound to said DNA segment. One skilled in the art would select the 
nucleic acid probe according to techniques known in the art as described 
above. Samples to be tested include but should not be limited to RNA samples 
of human tissue. The presence of EBI 1, EBI 2, or EBI 3 may indicate that 
the cells had been infected with the Epstein Barr virus. Increases in the 
amount of EBI 1, EBI 2, or EBI 3 RNA in a sample may also indicate the 
presence of or infection with the Epstein Barr virus. 

In another embodiment, the present invention relates to a kit for 
detecting the presence of Epstein Barr virus in a sample comprising at least 
one container means having disposed therein the above-described nucleic acid 
probe. In a preferred embodiment, the kit further comprises other containers 
comprising one or more of the following: wash reagents and reagents capable 
of detecting the presence of bound nucleic acid probe. Examples of detection 
reagents include, but are not limited to, radiolabelled probes, enzymatic labeled 
probes (horse radish peroxidase, alkaline phosphatase), and affinity labeled 
probes (biotin, avidin, or streptavidin). 

In detail, a compartmentalized kit includes any kit in which reagents 
are contained in separate containers. Such containers include small glass 
containers, plastic containers or strips of plastic or paper. Such containers 
allow the efficient transfer of reagents from one compartment to another 
compartment such that the samples and reagents are not cross-contaminated 
and the agents or solutions of each container can be added in a quantitative 
fashion from one compartment to another. Such containers will include a 
container which will accept the test sample, a container which contains the 
probe or primers used in the assay, containers which contain wash reagents 
(such as phosphate buffered saline, Tris-buffers, and the like), and containers 
which contain the reagents used to detect the hybridized probe, bound 
antibody, amplified product, or the like. 

One skilled in the art will readily recognize that the nucleic acid probes 
described in the present invention can readily be incorporated into one of the 
established kit formats which are well known in the art. 

F. DNA constructs comprising an EBI 1, EBI 2, or EBI 3 DNA segment 
and cells containing these constructs. 

In another embodiment, the present invention relates to a recombinant 
DNA molecule comprising, 5' to 3', a promoter effective to initiate 
transcription in a host cell and the above-described DNA segments. 

In another embodiment, the present invention relates to a recombinant 
DNA molecule comprising a vector and an above-described DNA segment. 

In another embodiment, the present invention relates to a DNA 
molecule comprising a transcriptional region functional in a cell, a sequence 
complementary to an RNA sequence encoding an amino acid sequence 
corresponding to the above-described polypeptide, and a transcriptional 
termination region functional in said cell. 
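
The relationship between an RNA sequence and a sequence complementary to it can be made concrete with a short sketch. The six-nucleotide mRNA below is hypothetical and the function names are invented for illustration; the code simply forms the reverse complement, which is the sequence an antisense transcript must have in order to base-pair with the target RNA.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(mrna):
    """Return the RNA, written 5' to 3', that is complementary to the given mRNA."""
    return "".join(RNA_COMPLEMENT[b] for b in reversed(mrna))


def antisense_insert_dna(mrna):
    """DNA sequence (sense of the antisense transcript) to place after a promoter."""
    return antisense_rna(mrna).replace("U", "T")


target = "AUGGCU"  # hypothetical fragment of a target mRNA
print(antisense_rna(target))
print(antisense_insert_dna(target))
```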

Preferably, the above-described molecules are isolated and/or purified 
DNA molecules. 

In another embodiment, the present invention relates to a cell that 
contains an above-described DNA molecule. 

In another embodiment, the peptide is purified from cells which have 
been altered to express the peptide. 

As used herein, a cell is said to be "altered to express a desired 
peptide" when the cell, through genetic manipulation, is made to produce a 
protein which it normally does not produce or which the cell normally 
produces at low levels. One skilled in the art can readily adapt procedures for 
introducing and expressing either genomic, cDNA, or synthetic sequences into 
either eukaryotic or prokaryotic cells. 

A nucleic acid molecule, such as DNA, is said to be "capable of 
expressing" a polypeptide if it contains nucleotide sequences which contain 
transcriptional and translational regulatory information and such sequences are 
"operably linked" to nucleotide sequences which encode the polypeptide. An 
operable linkage is a linkage in which the regulatory DNA sequences and the 
DNA sequence sought to be expressed are connected in such a way as to 
permit gene sequence expression. The precise nature of the regulatory regions 
needed for gene sequence expression may vary from organism to organism, 
but shall in general include a promoter region which, in prokaryotes, contains 
both the promoter (which directs the initiation of RNA transcription) as well 
as the DNA sequences which, when transcribed into RNA, will signal 
synthesis initiation. Such regions will normally include those 5'-non-coding 
sequences involved with initiation of transcription and translation, such as the 
TATA box, capping sequence, CAAT sequence, and the like. 

If desired, the non-coding region 3' to the sequence encoding an EBI 
1, EBI 2, or EBI 3 gene may be obtained by the above-described methods. 
This region may be retained for its transcriptional termination regulatory 
sequences, such as termination and polyadenylation. Thus, by retaining the 
3'-region naturally contiguous to the DNA sequence encoding an EBI 1, EBI 
2, or EBI 3 gene, the transcriptional termination signals may be provided. 
Where the transcriptional termination signals are not satisfactorily functional 
in the expression host cell, then a 3' region functional in the host cell may be 
substituted. 

Two DNA sequences (such as a promoter region sequence and an EBI 
1, EBI 2, or EBI 3 sequence) are said to be operably linked if the nature of 
the linkage between the two DNA sequences does not (1) result in the 
introduction of a frame-shift mutation, (2) interfere with the ability of the 
promoter region sequence to direct the transcription of an EBI 1, EBI 2, or EBI 
3 gene sequence, or (3) interfere with the ability of an EBI 1, EBI 2, or 
EBI 3 gene sequence to be transcribed by the promoter region sequence. 
Thus, a promoter region would be operably linked to a DNA sequence if the 
promoter were capable of effecting transcription of that DNA sequence. 

Thus, to express an EBI 1, EBI 2, or EBI 3 gene, transcriptional and 
translational signals recognized by an appropriate host are necessary. 

The present invention encompasses the expression of the EBI 1, EBI 
2, or EBI 3 gene (or a functional derivative thereof) in either prokaryotic or 
eukaryotic cells. Prokaryotic hosts are, generally, the most efficient and 
convenient for the production of recombinant proteins and, therefore, are 
preferred for the expression of the EBI 1, EBI 2, or EBI 3 gene. 

Prokaryotes most frequently are represented by various strains of 
E. coli. However, other microbial strains may also be used, including other 
bacterial strains. 

In prokaryotic systems, plasmid vectors that contain replication sites 
and control sequences derived from a species compatible with the host may be 
used. Examples of suitable plasmid vectors may include pBR322, pUC18, 
pUC19 and the like; suitable phage or bacteriophage vectors may include 
λgt10, λgt11 and the like; and suitable virus vectors may include pMAM-neo, 
pKRC and the like. Preferably, the selected vector of the present invention 
has the capacity to replicate in the selected host cell. 

Recognized prokaryotic hosts include bacteria such as E. coli, Bacillus, 
Streptomyces, Pseudomonas, Salmonella, Serratia, and the like. However, 
under such conditions, the peptide will not be glycosylated. The prokaryotic 
host must be compatible with the replicon and control sequences in the 
expression plasmid. 

To express EBI 1, EBI 2, or EBI 3 (or a functional derivative thereof) in a 
prokaryotic cell, it is necessary to operably link the EBI 1, EBI 2, or EBI 3 
sequence to a functional prokaryotic promoter. Such promoters may be either 
constitutive or, more preferably, regulatable (i.e., inducible or derepressible). 
Examples of constitutive promoters include the int promoter of bacteriophage 
λ, the bla promoter of the β-lactamase gene sequence of pBR322, and the 
CAT promoter of the chloramphenicol acetyl transferase gene sequence of 
pPR325, and the like. Examples of inducible prokaryotic promoters include 
the major right and left promoters of bacteriophage λ (PL and PR), the trp, 
recA, lacZ, lacI, and gal promoters of E. coli, the α-amylase (Ulmanen et al., 
J. Bacteriol. 162:176-182 (1985)) and the σ-28-specific promoters of B. 
subtilis (Gilman et al., Gene sequence 32:11-20 (1984)), the promoters of the 
bacteriophages of Bacillus (Gryczan, In: The Molecular Biology of the Bacilli, 
Academic Press, Inc., NY (1982)), and Streptomyces promoters (Ward et al., 
Mol. Gen. Genet. 203:468-478 (1986)). 

Prokaryotic promoters are reviewed by Glick (J. Ind. Microbiol. 1:277- 
282 (1987)); Cenatiempo (Biochimie 68:505-516 (1986)); and Gottesman (Ann. 
Rev. Genet. 18:415-442 (1984)). 

Proper expression in a prokaryotic cell also requires the presence of a 
ribosome binding site upstream of the coding sequence. Such 
ribosome binding sites are disclosed, for example, by Gold et al. (Ann. Rev. 
Microbiol. 35:365-404 (1981)). 

The selection of control sequences, expression vectors, transformation 
methods, and the like, are dependent on the type of host cell used to express 
the gene. As used herein, "cell", "cell line", and "cell culture" may be used 
interchangeably and all such designations include progeny. Thus, the words 
"transformants" or "transformed cells" include the primary subject cell and 
cultures derived therefrom, without regard to the number of transfers. It is 
also understood that all progeny may not be precisely identical in DNA 
content, due to deliberate or inadvertent mutations. However, as defined, 
mutant progeny have the same functionality as that of the originally 
transformed cell. 

Host cells which may be used in the expression systems of the present 
invention are not strictly limited, provided that they are suitable for use in the 
expression of the EBI 1, 2, or 3 peptide of interest. Suitable hosts may often 
include eukaryotic cells. 

Preferred eukaryotic hosts include, for example, yeast, fungi, insect 
cells, and mammalian cells, either in vivo or in tissue culture. Mammalian cells 
which may be useful as hosts include HeLa cells, cells of fibroblast origin 
such as VERO or CHO-K1, or cells of lymphoid origin, such as the 
hybridoma SP2/0-AG14 or the myeloma P3x63Sg8, and their derivatives. 
Preferred mammalian host cells include SP2/0 and J558L, as well as 
neuroblastoma cell lines such as IMR 332 that may provide better capacities 
for correct post-translational processing. 

In addition, plant cells are also available as hosts, and control 
sequences compatible with plant cells are available, such as the nopaline 
synthase promoter and polyadenylation signal sequences. 

Another preferred host is an insect cell, for example from Drosophila 
larvae. Using insect cells as hosts, the Drosophila alcohol dehydrogenase 
promoter can be used (Rubin, Science 240:1453-1459 (1988)). Alternatively, 
baculovirus vectors can be engineered to express large amounts of EBI 1, EBI 
2, or EBI 3 in insect cells (Jasny, Science 238:1653 (1987); Miller et al., In: 
Genetic Engineering (1986), Setlow, J.K., et al., eds., Plenum, Vol. 8, pp. 
277-297). 

Any of a series of yeast gene sequence expression systems can be 
utilized which incorporate promoter and termination elements from the actively 
expressed gene sequences coding for glycolytic enzymes, which are produced in 
large quantities when yeast are grown in media rich in glucose. Known 
glycolytic gene sequences can also provide very efficient transcriptional control 
signals. 

Yeast provides substantial advantages in that it can also carry out 
post-translational peptide modifications. A number of recombinant DNA strategies 
exist which utilize strong promoter sequences and high copy number of 
plasmids which can be utilized for production of the desired proteins in yeast. 
Yeast recognizes leader sequences on cloned mammalian gene sequence 
products and secretes peptides bearing leader sequences (i.e., pre-peptides). 
For a mammalian host, several possible vector systems are available for the 
expression of EBI 1, EBI 2, or EBI 3. 

A wide variety of transcriptional and translational regulatory sequences 
may be employed, depending upon the nature of the host. The transcriptional 
and translational regulatory signals may be derived from viral sources, such 
as adenovirus, bovine papilloma virus, simian virus, or the like, where the 
regulatory signals are associated with a particular gene sequence which has a 
high level of expression. Alternatively, promoters from mammalian 
expression products, such as actin, collagen, myosin, and the like, may be 
employed. Transcriptional initiation regulatory signals may be selected which 
allow for repression or activation, so that expression of the gene sequences can 
be modulated. Of interest are regulatory signals which are temperature- 
sensitive so that by varying the temperature, expression can be repressed or 
initiated, or are subject to chemical (such as metabolite) regulation. 

As discussed above, expression of EBI 1, EBI 2, or EBI 3 in 
eukaryotic hosts requires the use of eukaryotic regulatory regions. Such 
regions will, in general, include a promoter region sufficient to direct the 
initiation of RNA synthesis. Preferred eukaryotic promoters include, for 
example, the promoter of the mouse metallothionein I gene sequence (Hamer 
et al., J. Mol. Appl. Gen. 1:273-288 (1982)); the TK promoter of Herpes 
virus (McKnight, Cell 31:355-365 (1982)); the SV40 early promoter (Benoist 
et al., Nature (London) 290:304-310 (1981)); the yeast gal4 gene sequence 
promoter (Johnston et al., Proc. Natl. Acad. Sci. (USA) 79:6971-6975 (1982); 
Silver et al., Proc. Natl. Acad. Sci. (USA) 81:5951-5955 (1984)). 

As is widely known, translation of eukaryotic mRNA is initiated at the 
codon which encodes the first methionine. For this reason, it is preferable to 
ensure that the linkage between a eukaryotic promoter and a DNA sequence 
which encodes EBI 1, EBI 2, or EBI 3 (or a functional derivative thereof) 
does not contain any intervening codons which are capable of encoding a 
methionine (i.e., AUG). The presence of such codons results either in the 
formation of a fusion protein (if the AUG codon is in the same reading frame 
as the EBI 1, EBI 2, or EBI 3 coding sequence) or a frame-shift mutation (if 
the AUG codon is not in the same reading frame as the EBI 1, EBI 2, or EBI 
3 coding sequence). 
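
A check for intervening methionine codons can be sketched as follows. The linker sequences are hypothetical and the function name is invented for the example. The sketch assumes that the linker is the DNA lying between the promoter and the authentic initiation codon, so the authentic codon begins immediately after the last linker nucleotide. It ignores any stop codon that might follow an upstream ATG, which would change the outcome in practice.

```python
def upstream_atgs(linker):
    """Find each ATG in the linker and report its frame relative to the coding sequence.

    The authentic start codon is assumed to begin at index len(linker), so an
    upstream ATG at index i is in frame when (len(linker) - i) is a multiple of 3.
    """
    linker = linker.upper()
    hits = []
    for i in range(len(linker) - 2):
        if linker[i:i + 3] == "ATG":
            in_frame = (len(linker) - i) % 3 == 0
            hits.append((i, "in-frame" if in_frame else "out-of-frame"))
    return hits


# Hypothetical linkers between a promoter and the coding sequence.
print(upstream_atgs("GCCGCC"))     # no intervening ATG
print(upstream_atgs("GCCATGGCC"))  # upstream ATG in frame: fusion protein expected
print(upstream_atgs("GCCATGGC"))   # upstream ATG out of frame: frame-shift expected
```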

An EBI 1, EBI 2, or EBI 3 DNA segment and an operably linked 
promoter may be introduced into a recipient prokaryotic or eukaryotic cell 
either as a non-replicating DNA (or RNA) molecule, which may either be a 
linear molecule or, more preferably, a closed covalent circular molecule. 
Since such molecules are incapable of autonomous replication, the expression 
of the gene may occur through the transient expression of the introduced 
sequence. Alternatively, permanent expression may occur through the 
integration of the introduced DNA sequence into the host chromosome. 

In one embodiment, a vector is employed which is capable of 
integrating the desired gene sequences into the host cell chromosome. Cells 
which have stably integrated the introduced DNA into their chromosomes can 
be selected by also introducing one or more markers which allow for selection 
of host cells which contain the expression vector. The marker may provide 
for prototrophy to an auxotrophic host, biocide resistance, e.g., antibiotics, or 
heavy metals, such as copper, or the like. The selectable marker gene 
sequence can either be directly linked to the DNA gene sequences to be 
expressed, or introduced into the same cell by co-transfection. Additional 
elements may also be needed for optimal synthesis of single chain binding 
protein mRNA. These elements may include splice signals, as well as 
transcription promoters, enhancers, and termination signals. cDNA expression 
vectors incorporating such elements include those described by Okayama, 
Molec. Cell. Biol. 3:280 (1983). 

In a preferred embodiment, the introduced sequence will be 
incorporated into a plasmid or viral vector capable of autonomous replication 
in the recipient host. Any of a wide variety of vectors may be employed for 
this purpose. Factors of importance in selecting a particular plasmid or viral 
vector include: the ease with which recipient cells that contain the vector may 
be recognized and selected from those recipient cells which do not contain the 
vector; the number of copies of the vector which are desired in a particular 
host; and whether it is desirable to be able to "shuttle" the vector between host 
cells of different species. Preferred prokaryotic vectors include plasmids such 
as those capable of replication in E. coli (such as, for example, pBR322, 
ColE1, pSC101, pACYC 184, πVX). Such plasmids are, for example, 
disclosed by Sambrook (cf. Molecular Cloning: A Laboratory Manual, second 
edition, edited by Sambrook, Fritsch, & Maniatis, Cold Spring Harbor 
Laboratory, 1989). Bacillus plasmids include pC194, pC221, pT127, and the 
like. Such plasmids are disclosed by Gryczan (In: The Molecular Biology of 
the Bacilli, Academic Press, NY (1982), pp. 307-329). Suitable Streptomyces 
plasmids include pIJ101 (Kendall et al., J. Bacteriol. 169:4177-4183 (1987)), 
and Streptomyces bacteriophages such as φC31 (Chater et al., In: Sixth 
International Symposium on Actinomycetales Biology, Akademiai Kaido, 
Budapest, Hungary (1986), pp. 45-54). Pseudomonas plasmids are reviewed 
by John et al. (Rev. Infect. Dis. 8:693-704 (1986)), and Izaki (Jpn. J. 
Bacteriol. 33:729-742 (1978)). 

Preferred eukaryotic plasmids include, for example, BPV, vaccinia, 
SV40, 2-micron circle, and the like, or their derivatives. Such plasmids are 
well known in the art (Botstein et al., Miami Wntr. Symp. (1982); 
Broach, In: The Molecular Biology of the Yeast Saccharomyces: Life Cycle 
and Inheritance, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 
p. 445-470 (1981); Broach, Cell 28:203-204 (1982); Bollon et al., J. Clin. 
Hematol. Oncol. 10:39-48 (1980); Maniatis, In: Cell Biology: A 
Comprehensive Treatise, Vol. 3, Gene Sequence Expression, Academic Press, 
NY, pp. 563-608 (1980)). 

Once the vector or DNA sequence containing the construct(s) has been 
prepared for expression, the DNA construct(s) may be introduced into an 
appropriate host cell by any of a variety of suitable means, i.e., 
transformation, transfection, conjugation, protoplast fusion, electroporation, 
calcium phosphate precipitation, and the like. After the 
introduction of the vector, recipient cells are grown in a selective medium, 
which selects for the growth of vector-containing cells. Expression of the 
cloned gene sequence(s) results in the production of EBI 1, EBI 2, or EBI 3, 
or fragments thereof. This can take place in the transformed cells as such, or 
following the induction of these cells to differentiate (for example, by 
administration of bromodeoxyuracil to neuroblastoma cells or the like). 

A variety of incubation conditions can be used to form the peptide of 
the present invention. The most preferred conditions are those which mimic 
physiological conditions. 







In another embodiment, the present invention relates to an antibody 
having binding affinity to a polypeptide having an amino acid sequence 
selected from the group consisting of EBI 1, EBI 2, and EBI 3 polypeptides, 
or a binding fragment thereof. In a preferred embodiment, the polypeptide 
has an amino acid sequence selected from the group of sequences set forth in 
SEQ ID NO:2, SEQ ID NO:4, and SEQ ID NO:6, or mutant or species 
variation thereof, or at least 7 contiguous amino acids thereof (preferably, at 
least 10, 15, 20, or 30 contiguous amino acids thereof). In another preferred 
embodiment, the antibody is a monoclonal antibody. 

In another embodiment, the present invention relates to a hybridoma 
which produces the above-described monoclonal antibody, or binding fragment 
thereof. 

The EBI 1, EBI 2, or EBI 3 proteins of the present invention can be 
used in a variety of procedures and methods, such as for the generation of 
antibodies, for use in identifying pharmaceutical compositions, and for 
studying DNA/protein interaction. 

The EBI 1, EBI 2, or EBI 3 peptide of the present invention can be 
used to produce antibodies or hybridomas. One skilled in the art will 
recognize that if an antibody is desired, such a peptide would be generated as 
described herein and used as an immunogen. 

The antibodies of the present invention include monoclonal and 
polyclonal antibodies, as well as fragments of these antibodies, and humanized 
forms. Humanized forms of the antibodies of the present invention may be 
generated using one of the procedures known in the art such as chimerization 
or CDR grafting. 

The invention also provides hybridomas which are capable of 
producing the above-described antibodies. A hybridoma is an immortalized 
cell line which is capable of secreting a specific monoclonal antibody. 

In general, techniques for preparing monoclonal antibodies and 
hybridomas are well known in the art (Campbell, 'Monoclonal Antibody 
Technology: Laboratory Techniques in Biochemistry and Molecular Biology' 
Elsevier Science Publishers, Amsterdam, The Netherlands (1984); St. Groth 
et al., J. Immunol. Methods 35:1-21 (1980)). 

Any animal (mouse, rabbit, and the like) which is known to produce 
antibodies can be immunized with the selected polypeptide. Methods for 
immunization are well known in the art. Such methods include subcutaneous 
or intraperitoneal injection of the polypeptide. One skilled in the art will 
recognize that the amount of polypeptide used for immunization will vary 
based on the animal which is immunized, the antigenicity of the polypeptide 
and the site of injection. 

The polypeptide may be modified or administered in an adjuvant in 
order to increase the peptide antigenicity. Methods of increasing the 
antigenicity of a polypeptide are well known in the art. Such procedures 
include coupling the antigen with a heterologous protein (such as globulin or 
β-galactosidase) or through the inclusion of an adjuvant during immunization. 

For monoclonal antibodies, spleen cells from the immunized animals 
are removed, fused with myeloma cells, such as SP2/0-Ag14 myeloma cells, 
and allowed to become monoclonal antibody producing hybridoma cells. 

Any one of a number of methods well known in the art can be used to 
identify the hybridoma cell which produces an antibody with the desired 
characteristics. These include screening the hybridomas with an ELISA assay, 
western blot analysis, or radioimmunoassay (Lutz et al., Exp. Cell Res. 
175:109-124 (1988)). 

Hybridomas secreting the desired antibodies are cloned and the class 
and subclass are determined using procedures known in the art (Campbell, 
Monoclonal Antibody Technology: Laboratory Techniques in Biochemistry and 
Molecular Biology, supra (1984)). 

For polyclonal antibodies, antibody-containing antiserum is isolated from 
the immunized animal and is screened for the presence of antibodies with the 
desired specificity using one of the above-described procedures. 

In another embodiment of the present invention, the above-described 
antibodies are detectably labeled. Antibodies can be detectably labeled 
through the use of radioisotopes, affinity labels (such as biotin, avidin, and the 
like), enzymatic labels (such as horseradish peroxidase, alkaline phosphatase, 
and the like), fluorescent labels (such as FITC or rhodamine, and the like), 
paramagnetic atoms, and the like. Procedures for accomplishing such labeling 
are well known in the art; for example, see Sternberger et al., J. Histochem. 
Cytochem. 15:315 (1970); Bayer et al., Meth. Enzym. 62:308 (1979); Engval 
et al., Immunol. 109:129 (1972); Goding, J. Immunol. Meth. 23:215 (1976). 
The labeled antibodies of the present invention can be used for in vitro, in 
vivo, and in situ assays to identify cells or tissues which express a specific 
peptide. 

In another embodiment of the present invention, the above-described 
antibodies are immobilized on a solid support. Examples of such solid 
supports include plastics such as polycarbonate, complex carbohydrates such 
as agarose and sepharose, acrylic resins such as polyacrylamide, and latex 
beads. Techniques for coupling antibodies to such solid supports are well 
known in the art (Weir et al., 'Handbook of Experimental Immunology' 4th 
Ed., Blackwell Scientific Publications, Oxford, England, Chapter 10 (1986); 
Jacoby et al., Meth. Enzym. 34 Academic Press, N.Y. (1974)). The 
immobilized antibodies of the present invention can be used for in vitro, in 
vivo, and in situ assays as well as in immunochromatography. 

Furthermore, one skilled in the art can readily adapt currently available 
procedures, as well as the techniques, methods and kits disclosed above with 
regard to antibodies, to generate peptides capable of binding to a specific 
peptide sequence in order to generate rationally designed antipeptide peptides, 
for example see Hurby et al., "Application of Synthetic Peptides: Antisense 
Peptides", In Synthetic Peptides, A User's Guide, W.H. Freeman, NY, pp. 
289-307 (1992), and Kaspczak et al., Biochemistry 28:9230-8 (1989). 

Anti-peptide peptides can be generated in one of two fashions. First, 
the anti-peptide peptides can be generated by replacing the basic amino acid 
residues found in the EBI 1, EBI 2, or EBI 3 peptide sequence with acidic 
residues, while maintaining hydrophobic and uncharged polar groups. For 
example, lysine, arginine, and/or histidine residues are replaced with aspartic 
acid or glutamic acid and glutamic acid residues are replaced by lysine, 
arginine or histidine. 
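
The charge-swap rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `design_antipeptide`, its parameters, and the example sequence are hypothetical; the choice of which acidic or basic residue to substitute is left to the caller; and aspartic acid is left unchanged because the text names only glutamic acid for the acidic-to-basic replacement.

```python
# Illustrative sketch only; names and defaults are assumptions, not part of
# the described method.
BASIC = {"K", "R", "H"}   # lysine, arginine, histidine
ACIDIC_CHOICES = {"D", "E"}  # aspartic acid, glutamic acid


def design_antipeptide(peptide, acidic_sub="D", basic_sub="K"):
    """Swap charged residues in a one-letter peptide sequence.

    Basic residues (K, R, H) become `acidic_sub`; glutamic acid (E) becomes
    `basic_sub`. All other residues, including hydrophobic and uncharged
    polar ones, are kept as they are.
    """
    if acidic_sub not in ACIDIC_CHOICES:
        raise ValueError("acidic_sub must be 'D' or 'E'")
    if basic_sub not in BASIC:
        raise ValueError("basic_sub must be 'K', 'R' or 'H'")

    swapped = []
    for residue in peptide.upper():
        if residue in BASIC:
            swapped.append(acidic_sub)
        elif residue == "E":
            swapped.append(basic_sub)
        else:
            swapped.append(residue)
    return "".join(swapped)


if __name__ == "__main__":
    # "AKLEHR" is a made-up sequence used only to show the substitution.
    print(design_antipeptide("AKLEHR"))
```

Exposing the substituted residues as parameters reflects the "and/or" wording of the rule, which leaves the specific replacement open.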

H. A method of detecting an EBI 1, EBI 2, or EBI 3 polypeptide in a sample. 

In another embodiment, the present invention relates to a method of 
detecting a polypeptide selected from the group consisting of EBI 1, EBI 2, 
and EBI 3 in a sample, comprising: a) contacting the sample with an 
above-described antibody, under conditions such that immunocomplexes form, and 
b) detecting the presence of said antibody bound to the polypeptide. In detail, 
the methods comprise incubating a test sample with one or more of the 
antibodies of the present invention and assaying whether the antibody binds to 
the test sample. The presence of an EBI 1, EBI 2, or EBI 3 polypeptide or 
fragment thereof in a sample may indicate the presence or infection of Epstein 
Barr virus. 

Conditions for incubating an antibody with a test sample vary. 
Incubation conditions depend on the format employed in the assay, the 
detection methods employed, and the type and nature of the antibody used in 
the assay. One skilled in the art will recognize that any one of the commonly 
available immunological assay formats (such as radioimmunoassays, 
enzyme-linked immunosorbent assays, diffusion based Ouchterlony, or rocket 
immunofluorescent assays) can readily be adapted to employ the antibodies of 
the present invention. Examples of such assays can be found in Chard, "An 
Introduction to Radioimmunoassay and Related Techniques," Elsevier Science 
Publishers, Amsterdam, The Netherlands (1986); Bullock et al., "Techniques 
in Immunocytochemistry," Academic Press, Orlando, FL Vol. 1 (1982), Vol. 
2 (1983), Vol. 3 (1985); Tijssen, "Practice and Theory of Enzyme 
Immunoassays: Laboratory Techniques in Biochemistry and Molecular 
Biology," Elsevier Science Publishers, Amsterdam, The Netherlands (1985). 

The immunological assay test samples of the present invention include 
cells, protein or membrane extracts of cells, or biological fluids such as blood, 
serum, plasma, or urine. The test sample used in the above-described method 
will vary based on the assay format, nature of the detection method and the 
tissues, cells or extracts used as the sample to be assayed. Methods for 
preparing protein extracts or membrane extracts of cells are well known in the 
art and can readily be adapted in order to obtain a sample which is compatible 
with the system utilized. 

I. A diagnostic kit comprising antibodies to EBI 1, EBI 2, and EBI 3. 

In another embodiment of the present invention, a kit is provided which 
contains all the necessary reagents to carry out the previously described 
methods of detection. The kit may comprise: i) a first container means 
containing an above-described antibody, and ii) second container means 
containing a conjugate comprising a binding partner of the antibody and a 
label. In another preferred embodiment, the kit further comprises one or more 
other containers comprising one or more of the following: wash reagents and 
reagents capable of detecting the presence of bound antibodies. Examples of 
detection reagents include, but are not limited to, labeled secondary antibodies, 
or in the alternative, if the primary antibody is labeled, the chromophoric, 
enzymatic, or antibody binding reagents which are capable of reacting with the 
labeled antibody. The compartmentalized kit may be as described above for 
nucleic acid probe kits. 

One skilled in the art will readily recognize that the antibodies 
described in the present invention can readily be incorporated into one of the 
established kit formats which are well known in the art. 

The present invention is described in further detail in the following 
non-limiting examples. 

EXAMPLES 

The following protocols and experimental details are referenced in the 
examples that follow: 

Cells and cell lines. BL41 and BL30 are EBV(-) Burkitt lymphoma 
cell lines. The BL41/B95-8 and BL41/P3HR1 cell lines were derived by 
infecting BL41 with the transforming EBV strain, B95-8, or with the 
non-transforming strain, P3HR1, respectively (Favrot, M.C., et al., Intl. J. Cancer 
38(6):901-6 (1986)). IB4 is a latently infected B lymphoblastoid cell line 
established by infection of B lymphocytes with EBV (B95-8) in vitro. 
RHEK-1 (generous gift from Dr. Jong Rhim, National Cancer Institute, 
Bethesda, MD) is a human keratinocyte line derived by infection of primary 
foreskin epithelial cells with an adenovirus 12/SV40 hybrid virus. K562 is a 
Philadelphia chromosome-positive human chronic myeloid leukemia cell line. 
U937 is a histiocytic lymphoma cell line with monocytic features. HL60 is a 
promyelocytic leukemia line. HSB-2 and Jurkat are human T lymphoblastic 
leukemia cell lines. TK143 was derived from a human osteosarcoma. 

Human mononuclear cells (PBMC) were purified from peripheral blood 
by centrifugation on a ficoll cushion (Ficoll-Hypaque, Pharmacia, Vineland, 
NJ). Cells were resuspended at 1x10^6 cells/ml in RPMI medium 
supplemented with 20% fetal bovine serum, and were divided into parallel 
cultures grown 72 h with or without 2.5 µg/ml pokeweed mitogen (PWM, 
Sigma, St. Louis, MO). T cells were isolated from purified PBMCs by 
rosetting overnight with aminoethylisothiouronium bromide (AET) treated 
sheep erythrocytes at 4°C, followed by centrifugation over ficoll. Pelleted 
erythrocytes were lysed with ammonium chloride. The remaining T cells were 
resuspended in RPMI with 20% fetal bovine serum at 1x10^6 cells/ml. 
Phytohemagglutinin (PHA, Sigma) was added to a final concentration of 1.0 
µg/ml. Cells were cultured for 72 h and harvested for extraction of total 
cellular RNA. 

RNA preparation and analysis. Cytoplasmic RNA was isolated from 
exponentially growing cells by a modification of the acid phenol/guanidinium 
isothiocyanate extraction procedure, followed by reprecipitation in guanidinium 
hydrochloride/ethanol. Total cellular RNA was extracted from 0.2 to 2 g 
samples of human spleen and tonsil obtained from surgical specimens, and 
from human postmortem bone marrow. Tissues were homogenized in acid 
phenol/guanidinium isothiocyanate using a rotary tissue homogenizer, extracted 
and precipitated. After dissolution in guanidinium hydrochloride and 
reprecipitation with ethanol, human tissue RNA samples were resuspended in 
H2O and precipitated by addition of an equal volume of 8 M LiCl. The 
polyadenylated fractions of BL41 or BL41/B95-8 RNA were purified by 
2 successive cycles of chromatography on oligodeoxythymidylate cellulose. 
Polyadenylated IB4 RNA was purified by a single round of 
oligodeoxythymidylate selection. RNA samples (12 µg per lane) were size 
fractionated on 0.66 M formaldehyde, 1% agarose gels and transferred to 
charged nylon membranes (GeneScreen Plus, New England Nuclear, Billerica, 
MA) for subsequent hybridization analysis. To examine gene expression in 
other human tissues, a commercially prepared blot was purchased containing 
2 µg of polyadenylated heart, brain, placenta, lung, liver, kidney, 
skeletal muscle and pancreas RNA (Multiple Tissue Northern, Clontech, Palo 
Alto, CA). 

Probes were prepared from cloned cDNA inserts using random 
hexamer primers and 32P-dCTP. The beta actin probe was generated using 
a previously described 1.4 kb cDNA (Alfieri, C., et al., Virology 181(2):595- 
608 (1991)). The glyceraldehyde phosphate dehydrogenase (GAPDH) probe 
was prepared from a commercially obtained DNA fragment (Clontech). 

Filters were hybridized for 18 to 24 h at 47°C in a hybridization buffer 
consisting of 50% formamide, 6X SSPE (20X SSPE: 3.0 M NaCl, 200 mM 
NaPO4, pH 7.4, 20 mM EDTA), 1% SDS, 1X Denhardt's solution (100X 
Denhardt's: 2% BSA, 2% polyvinylpyrrolidone, 2% Ficoll), and 100 µg/ml 
sheared single-stranded herring testis DNA. Filters were washed according 
to the manufacturers' instructions, with high stringency washes performed at 
67°-70°C in 1% SDS, 0.2X SSC, and exposed to preflashed film (X-OMAT 
AR, Kodak, Rochester, NY) at -80°C for 2 h to 10 days. Autoradiographic 
signal intensities were quantitated by densitometric scanning using a Beckman 
DU-8 spectrophotometer equipped with a slab gel Compuset Module. 
Induction factors were calculated for each probe as signal intensity ratios for 
EBV(+) versus EBV(-) cells, divided by the ratio of beta actin signal 
intensities. 
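
The induction factor defined above is a ratio of ratios, and a short Python sketch may make the arithmetic concrete. The function name, argument names, and the densitometric readings in the example are hypothetical; only the formula itself comes from the text.

```python
def induction_factor(probe_pos, probe_neg, actin_pos, actin_neg):
    """Actin-normalized induction factor for one probe.

    probe_pos / probe_neg: probe signal in EBV(+) and EBV(-) cells.
    actin_pos / actin_neg: beta actin signal in the same two lanes.
    All values are densitometric intensities in the same arbitrary units.
    """
    values = (probe_pos, probe_neg, actin_pos, actin_neg)
    if any(v <= 0 for v in values):
        # A ratio cannot be formed when a signal is absent, for example
        # when no RNA is detected in the EBV(-) lane.
        raise ValueError("all signal intensities must be positive")
    return (probe_pos / probe_neg) / (actin_pos / actin_neg)


if __name__ == "__main__":
    # Made-up readings: probe ratio 12, actin ratio 2, so the factor is 6.
    print(induction_factor(120.0, 10.0, 30.0, 15.0))
```

Rejecting non-positive readings is a design choice of this sketch: when a transcript is undetectable in EBV(-) cells, the fold induction is unbounded rather than a finite number.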

cDNA library preparation. First strand cDNA was prepared from 5 
µg polyadenylylated BL41/B95-8 RNA using Moloney murine leukemia virus 
reverse transcriptase (Superscript, Bethesda Research Laboratories, 
Gaithersburg, MD) and oligodeoxythymidylate primers in a 100 µL reaction. 
Second strand cDNA was synthesized using E. coli DNA polymerase I and 
RNAse H. The double stranded cDNA was blunt-ended with T4 DNA 
polymerase and EcoRI methylated. After ligation of EcoRI linkers, the cDNA 
was EcoRI restriction digested and size fractionated by gel filtration 
chromatography on Sepharose CL4B. The purified cDNA was ligated to 
phosphorylated lambda gt10 arms (Promega, Madison, WI) and packaged 
(Gigapack Gold, Stratagene, La Jolla, CA). 

Subtractive probe preparation. Radiolabeled cDNA was prepared 
from 6 µg of polyadenylylated BL41 or BL41/B95-8 RNA in a 200 µL 
reaction containing 50 µg/ml random DNA hexamers; 0.5 mM dATP, dGTP, 
dTTP; 25 µM unlabeled dCTP; 1.0 mCi 32P-dCTP (800 Ci/mMole, New 
England Nuclear); 2000 units recombinant Moloney murine leukemia virus 
reverse transcriptase. Reactions were incubated at 42°C for 1 h. After precipitation, 
reaction products were resuspended in 0.1 M NaOH and incubated 20 min at 
65°C to hydrolyze RNA templates. Probes were neutralized with 0.1 M 
acetic acid and size fractionated on G-50 Sephadex. Biotinylated RNA was 
prepared from polyadenylylated BL41 RNA using a photoactivatable azido-aryl 
biotin reagent (Photoprobe Biotin, Vector Laboratories, Burlingame, CA) 
following the manufacturer's protocol. Probe fractions were combined with 
48 µg (for BL41/B95-8 probe) or 12 µg (for BL41 probe) biotinylated BL41 
RNA and precipitated with ethanol. BL41/B95-8 probes were hybridized with 
an 8 fold excess (2 mg/ml) of biotinylated BL41 RNA, while BL41 control 
probes were hybridized with a 2 fold excess (0.5 mg/ml) of biotinylated BL41 
RNA. Hybridizations and subtractions were performed using the "Subtractor" 
kit (Invitrogen, San Diego, CA) according to the manufacturer's instructions. 
The precipitated cDNA/RNA mixtures were resuspended in 10 to 20 µL H2O 
and heated to 100°C for 1 min. An equal volume of 2X hybridization buffer 
(Invitrogen) was added and the mixture was incubated at 65°C for 20 to 24 h. 
Following addition of an equal volume of HEPES buffer (10 mM HEPES, pH 
7.5, 1 mM EDTA), 20 µg streptavidin was added and the mixture was 
incubated on ice for 10 min. Biotinylated RNA and RNA:cDNA duplexes, 
complexed with avidin, were removed by repeated phenol/chloroform 
extractions. The single stranded, subtracted BL41 cDNA probe which 
remained in the aqueous phase was used directly for in situ filter 
hybridizations. Aqueous phase BL41/B95-8 cDNA probe was precipitated 
with ethanol and subjected to a second round of subtraction under identical 
conditions prior to use in filter hybridizations. Duplicate filters were made 
from 145 mm plates containing 6000 recombinant bacteriophage and were 
hybridized in parallel to equal amounts of BL41/B95-8 or BL41 subtracted 
probes. Filters were hybridized at 48°C for 48 to 72 h in a buffer consisting 
of 50% formamide, 6X SSPE, 1% SDS, 10% dextran sulfate, 2X Denhardt's 
solution, 100 µg/ml sheared single-stranded herring testis DNA, and 10 µg/ml 
poly rA:rU (Sigma, St. Louis, MO). Filters were washed at 72°C in 0.2X 
SSC and exposed 3 to 7 days to preflashed film (Kodak X-OMAT AR). 
Differentially expressed genes were identified by overlaying films from 
corresponding filters. Clones selected on primary screening were rescreened 
once at low density to verify differential expression and for plaque 
purification. 
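
The driver RNA excesses quoted in the subtraction protocol above can be checked with simple arithmetic. The sketch below rests on two assumptions that the text does not state outright: that the fold excess is taken relative to the 6 µg of polyadenylylated RNA used as template for each probe, and that both quoted driver concentrations are in mg/ml. The function names are hypothetical.

```python
def fold_excess(driver_ug, template_ug):
    """Mass ratio of biotinylated driver RNA to probe template RNA."""
    if template_ug <= 0:
        raise ValueError("template mass must be positive")
    return driver_ug / template_ug


def hybridization_volume_ul(driver_ug, driver_mg_per_ml):
    """Volume implied by a driver mass and concentration.

    1 mg/ml equals 1 ug/uL, so the result is in microliters.
    """
    if driver_mg_per_ml <= 0:
        raise ValueError("concentration must be positive")
    return driver_ug / driver_mg_per_ml


if __name__ == "__main__":
    template_ug = 6.0  # assumed reference mass for the fold excess
    for label, driver_ug, conc in (("EBV(+) probe", 48.0, 2.0),
                                   ("control probe", 12.0, 0.5)):
        print(label,
              fold_excess(driver_ug, template_ug),
              hybridization_volume_ul(driver_ug, conc))
```

Under these assumptions both hybridizations come out at the same implied volume, which falls within the range obtained by resuspending in 10 to 20 µL of water and adding an equal volume of 2X buffer.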

Analysis of clones. DNA was extracted from bulk liquid cultures of 
purified lambda gt10 clones and digested with EcoRI. cDNA inserts were 
purified by agarose gel electrophoresis and subcloned into pBluescript (+). 
Nucleotide sequences were determined and were compared by the BLAST 
algorithm (Altschul, S.F., et al., J. Mol. Biol. 215(3):403-10 (1990)) with 
known sequences resident in the National Center for Biotechnology 
Information databases using the Experimental GENINFO® BLAST Network 
Service, accessed through the Molecular Biology Computer Research Resource 
of the Dana-Farber Cancer Institute. Multiple sequence alignments were 
performed by the method of Higgins and Sharp (Higgins and Sharp, Gene 
73(1):237-44 (1988)) using the CLUSTAL program (PCGene, IntelliGenetics, 
Mountain View, CA) with open gap and unit gap costs of 10. 

EXAMPLE 1 
Identification of cDNA Clones 
of EBV Induced RNAs by Subtracted Probe Hybridization 

cDNA clones of RNA from an in vitro EBV-infected BL cell line, 
BL41/B95-8 [EBV(+) BL41], were differentially screened with an EBV(+) 
BL41 cDNA probe from which sequences complementary to EBV(-) BL41 cell 
RNA had been specifically removed, and with an EBV(-) BL41 control cDNA 
probe. Sequences complementary to EBV(-) BL41 RNA were removed from 
the EBV(+) BL41 cDNA probes by two subtractions with an 8 fold 
excess of biotinylated EBV(-) BL41 RNA. Overall, 85-95% of the labeled 
EBV(+) BL41 probe was removed by the two subtractions. EBV(-) BL41 
cDNA control probe was subtracted only once, removing 60-85% of the 
probe, thereby reducing hybridization to plaques containing cDNAs from 
abundant RNAs so that hybridization to cDNAs from less abundant BL41 
RNAs was evident. 

Seventy-five phage cDNA clones differentially hybridized to the 
EBV(+) BL41 probe on the first screen of 75,000 recombinant phage. 
Twenty-five clones were consistently positive on rescreening. The eighteen 
clones which demonstrated the greatest reactivity with the EBV(+) versus the 
EBV(-) BL41 cDNA probes were selected for nucleotide sequencing and RNA 
blot hybridization. 

EXAMPLE 2 
Nucleotide Sequences of EBV Induced cDNAs 

The first 12 clones are described in Table 1. Ten clones matched 7 
previously characterized genes: two independent clones each of the 
complement receptor type 2 (CD21), the serglycin proteoglycan core protein 
and vimentin; and one clone each of cathepsin H, annexin VI (p68), the 
myristylated alanine-rich protein kinase C substrate (MARCKS) and the 
lymphocyte hyaluronic acid receptor (CD44). The 2.6 kb MARCKS cDNA 
precisely matched the previous 1.58 kb human MARCKS cDNA clone 
(Harlan, D.M., et al., J. Biol. Chem. 266(22):14399-405 (1991)) at its 5 
prime end. The 3 prime untranslated region of the new clone is highly 
homologous to bovine MARCKS cDNA (Stumpo, D.J., et al., Proc. Natl. 
Acad. Sci. USA 86(11):4012-6 (1989)). 

The two remaining clones are from novel RNAs, EBV induced genes 
1 (EBI 1) and 2 (EBI 2), whose nucleotide sequences can be predicted to 
encode G-protein coupled peptide receptors. The complete nucleotide and 
deduced amino acid sequences of the EBI 1 and EBI 2 cDNAs are shown in 
Figures 1A and 1B, respectively. Because the first EBI 1 cDNA was 1.2 kb, 
significantly shorter than the 2.4 kb RNA, 20 other cDNA clones were 
obtained using the initial cDNA as a probe. The largest clone is 2153 
nucleotides (nt) and has a 1134 nt open reading frame (Figure 1A). This 
clone is probably nearly full length, since it is close to the expected size, 
considering it has only a short poly A tail. Translation is likely to initiate 
from either of two AUGs, at nt 64-66 or 82-84, the first of which conforms 
to a consensus translational initiation sequence (Kozak, M., J. Biol. Chem. 
266(30):19867-70 (1991)). An in-frame stop codon at nt 10-12 is consistent 
with downstream initiation at nt 64-66. The polypeptide encoded by the 
sequence beginning at nt 64 has a predicted molecular weight of 42.7 kD and 
includes eight hydrophobic domains likely to mediate membrane insertion. 
The first hydrophobic domain begins at the amino terminus and ends at a 
predicted signal peptidase cleavage site. The 7 remaining hydrophobic 
domains are characteristic of the G-protein coupled receptor family. Potential 
asparagine linked glycosylation sites are present in the extracellular amino 
terminal segment and in the third extracellular loop. 

Since the initial EBI 2 cDNA was 1643 nt and approximated the size 
expected from a 1.9 kb polyadenylated RNA, further cDNA clones were not 
obtained. The EBI 2 cDNA contains a 1083 nt open reading frame with two 
methionine codons at nt 34-36 and 46-48 (Figure 1B). Although neither 
methionine codon is in a favored initiation context (Kozak, M., J. Biol. Chem. 
266(30):19867-70 (1991)), an upstream, in-frame termination codon and the 
absence of other potential open reading frames are consistent with translation 
initiating at the first or second methionine codon. Initiation at the first would 
result in a 41.2 kD protein. The deduced amino acid sequence predicts 7 
hydrophobic transmembrane segments in the characteristic configuration of 
G-protein coupled receptors. In contrast to the EBI 1 protein, EBI 2 lacks a 
signal peptide. A possible N-linked glycosylation site is found in the amino 
terminal extracellular domain. Though the EBI 2 cDNA lacks a polyadenylate 
tail, a canonical polyadenylation signal (AATAAA) near the 3 prime end is 
consistent with the cDNA being essentially complete. 
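
The open reading frame lengths reported above can be related to protein size with a rough back-of-the-envelope calculation. The sketch below assumes (this is not stated in the text) that each reported open reading frame length includes the stop codon, and it uses the common rule-of-thumb average of about 110 Da per amino acid residue; the function name is hypothetical. The result is an estimate only and is not expected to reproduce the predicted molecular weights of 42.7 kD and 41.2 kD, which depend on the actual amino acid composition.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average residue mass (assumption)


def estimate_protein_size(orf_nt, includes_stop=True):
    """Return (residue_count, approximate_mass_kda) for an ORF length in nt."""
    if orf_nt <= 0 or orf_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_nt // 3
    residues = codons - 1 if includes_stop else codons
    return residues, residues * AVG_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    for name, orf_nt in (("EBI 1", 1134), ("EBI 2", 1083)):
        residues, kda = estimate_protein_size(orf_nt)
        print(name, residues, "residues, roughly", round(kda, 1), "kDa")
```

Both estimates land within a few percent of the predicted values, which is about as close as a composition-independent average can be expected to get.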

EXAMPLE 3 

Comparison of EBI 1 and 2 with other G protein coupled receptors 

The EBI 1 and EBI 2 nucleotide and predicted amino acid sequences 
were compared with the Genbank (release 72 and updates), EMBL (release 
31), Genbank translation, Swiss protein (release 22) and Protein Identification 
Resource (PIR, release 33) databases, using the BLAST algorithm (Altschul, 
S.F., et al., J. Mol. Biol. 215(3):403-10 (1990)). EBI 1 and EBI 2 are 
homologous to G protein associated receptors. EBI 1 is highly homologous 
to the human high or low affinity interleukin 8 (IL-8) receptors at both the 
nucleotide (data not shown) and amino acid sequence levels. The IL-8 receptor 
itself is not expressed on lymphocytes (Holmes, W.E., et al., Science 
253(5025):1278-80 (1991); Murphy and Tiffany, Science 253:1280- 
1283 (1991)). Excluding the putative EBI 1 signal peptide, the overall amino 
acid identity among the 3 proteins exceeds 30%, with conservative changes 
observed at many of the non-identical residues. The identity increases to 40% 
when EBI 1 is compared with either IL-8 receptor individually. Additional 
similarities with the IL-8 receptors include a high proportion of serine and 
threonine near the carboxy terminus, and a highly acidic amino terminal 
extracellular domain. The IL-8 receptor acidic residues are implicated in 
binding IL-8 basic amino acids (Holmes, W.E., et al., Science 
253(5025):1278-80 (1991); Murphy and Tiffany, Science 253:1280- 
1283 (1991)). 

The EBI 2 gene does not have such a close homologue. EBI 2 has 
24% amino acid identity to the thrombin receptor (Vu, T.K., et al., Cell 
64(6):1057-68 (1991)). Less extensive homologies are observed with a 
number of other G-protein coupled receptors, including the receptors for 
vasoactive intestinal polypeptide, somatostatin (type 1) and angiotensin II, as 
well as the low affinity IL-8 receptor. EBI 2 also exhibits more distant 
homologies with EBI 1 and the high affinity IL-8 receptor. Significantly, 
these are the same proteins which, in different order, exhibit the closest 
homologies with the EBI 1 protein. Together they constitute a subfamily of 
G-protein coupled peptide receptors. The greatest conservation of residues 
among these proteins extends from the first transmembrane domain to the 
second intracellular loop. Because of the particular conservation of an amino 
acid sequence among these G protein coupled receptors, we are able to 
identify a new highly conserved sequence motif at the carboxy end of TM III 
and the adjacent second intracellular loop. This motif, S-(I/L)-D-R-(Y/F)-X- 
X-X-X, with X being a hydrophobic amino acid, is in a wide variety of 
G-protein coupled receptors, and is not in other proteins in the data bases 
surveyed. Other highly conserved features of G protein coupled receptors in 
EBI 1 and 2 include the asparagine in TM I, the proline in TM II, the 
aspartate in the first intracellular loop, and the tryptophan and cysteine in the 
first extracellular loop. This cysteine has been postulated to be involved in 
disulfide linkage to a conserved cysteine present in the second extracellular 
loop in several other receptors, including the beta adrenergic and thrombin 
receptors. 
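
The S-(I/L)-D-R-(Y/F)-X-X-X-X motif described above translates naturally into a regular expression, sketched below in Python. The set of residues counted as hydrophobic (A, V, L, I, M, F, W, C) is an assumption made for illustration, since the text does not enumerate them; the function name and the test sequence are likewise made up and are not taken from any sequence in this document.

```python
import re

# Assumed hydrophobic residue set for the four X positions.
HYDROPHOBIC = "AVLIMFWC"

# S, then I or L, then D, R, then Y or F, then four hydrophobic residues.
MOTIF = re.compile(r"S[IL]DR[YF][%s]{4}" % HYDROPHOBIC)


def find_motif(sequence):
    """Return (0-based start, matched text) for each non-overlapping hit."""
    return [(m.start(), m.group()) for m in MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    # Synthetic sequence used only to exercise the pattern.
    print(find_motif("MGSIDRYLAIVQ"))
```

Changing `HYDROPHOBIC` is the simplest way to tighten or relax the pattern if a different definition of hydrophobic residues is preferred.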

EXAMPLE 4 

Analysis of induced gene expression by RNA blot hybridization 

Probes from seven of the nine EBV induced cDNAs were hybridized 
to identical blots of polyadenylated RNA from the EBV(+) or EBV(-) BL41 
cell lines or from the EBV transformed lymphoblastoid cell line, IB4 
(Figure 2). Vimentin and CD21 were previously shown to be EBV induced 
and were not further evaluated. The RNAs loaded in the EBV(+) BL41 and 
EBV(-) BL41 lanes were standardized with respect to beta actin reactivity. 
Significantly less IB4 cell RNA was used due to the high abundance of the 
putative induced gene RNAs in these cells (Figure 2, Actin probe). Probes 
from each of the cDNA clones detected RNAs which are significantly more 
abundant in both IB4 and EBV(+) BL41 cells than in EBV(-) BL41 cells. 
Induction factors indicated in Table 1 were determined by quantitative 
densitometric scanning of autoradiographs and reflect the fold enhancement of 
signal intensities in EBV(+) BL41 cells compared with EBV(-) BL41 cells, 
corrected for the ratio of actin reactivities. Standardization by actin reactivity, 
however, significantly underestimates the absolute induction levels since actin 
is induced 3-fold by EBV infection of BL41 cells relative to glyceraldehyde 
phosphate dehydrogenase (GAPDH), or to total RNA amounts quantitated 
spectrophotometrically. To achieve equal actin signal intensities, 3-fold more 
EBV(-) BL41 than EBV(+) BL41 RNA was loaded per lane. Importantly, 
each of the RNAs was at least as abundant in IB4 cells relative to GAPDH as 
in EBV(+) BL41 (Figure 2). 

EBI 1, EBI 2, CD44 and MARCKS are the most induced of the seven 
genes. The CD44 gene encodes three distinct RNAs of 1.6, 2.2 and 4.8 kb 
respectively in both IB4 and EBV(+) BL41 cells. No CD44 RNA was 
detected in EBV(-) BL41 cells even after prolonged autoradiographic exposures. 
EBI 2 RNA was also undetectable in EBV(-) BL41 cells. 

EXAMPLE 5 

Expression of EBI 1 and 2 in human cell lines and tissues 

The expression of EBI 1 and 2 in human cell lines and tissues was 
evaluated by hybridizing actin, EBI 1 or EBI 2 probes to blots of cell line or 
tissue RNAs. While EBI 1 is weakly expressed in BL41, EBI 2 is not; and 
neither EBI 1 nor EBI 2 is expressed in another EBV(-) BL cell line, BL30 
(Figure 3). EBI 1 and EBI 2 RNAs are abundant in primary human 
lymphocytes transformed by EBV in vitro and propagated as continuous 
lymphoblastoid cell lines for several years (IB4) or for less than 1 year 
(W91-LCL) (Figure 3). EBI 1 RNA is faintly detectable in the human T cell line 
Jurkat, and is abundantly expressed in a second T cell line, HSB-2 (Figure 3). 
EBI 2 RNA is not detected in either T cell line (Figure 3), nor in a third 
T cell line, MOLT-4. EBI 1 is not expressed in the human promyelocytic 
line, HL60, the chronic myelogenous leukemia cell line K562, the epithelial 
cell line, RHEK-1, the fibroblast-like osteosarcoma cell line, TK143, or the 
monocytic cell line, U937 (Figure 3). EBI 2, however, is expressed weakly, 
relative to actin, in HL60, U937 (U937 RNA is partially degraded) or HeLa 
cells (Figure 3). 

EBI 1 and 2 RNAs are abundant in human spleen, somewhat less abundant 
relative to actin in tonsil and are not detectable in bone marrow (Figure 3). 
Both genes were expressed in resting PBMCs at levels comparable to IB4 or 
LCL-W91 B lymphoblastoid cells (Figure 3). Expression increased in parallel 
cultures stimulated for 72 h with pokeweed mitogen (PWM), although actin 
expression also increased with PWM (Figure 3). The EBI 1 and 2 RNA in 
stimulated and non stimulated PBMC cultures is likely to be mostly in B 
lymphocytes since EBI 1 RNA is at low levels and EBI 2 RNA is absent from 
phytohemagglutinin stimulated, PBMC derived, T lymphocytes (Figure 3). 
These findings are consistent with expression patterns observed in T cell lines. 

EBI 1 and 2 RNA levels were also evaluated in a variety of 
non-hematopoietic human tissues. The EBI 1 probe detects small amounts of RNA 
in both lung and pancreas (Figure 4). Rehybridization of this blot with an 
immunoglobulin mu chain probe (Figure 4, Igµ probe) indicated that these 
tissue preparations contained significant amounts of immunoglobulin RNA, 
probably due to B lymphocytes in the tissues. Since EBI 1 RNA is abundant 
in peripheral blood lymphocytes, the EBI 1 RNA in the lung and pancreas is 
likely to be due to B lymphocytes. Similarly, the low level of EBI 2 RNA 
detected in pancreatic tissue is probably due to infiltrating B lymphocytes 
(Figure 4). However, the abundance of EBI 2 RNA in the lung is too great 
to attribute to lymphocyte contamination and is more likely due to specific 
expression in pulmonary epithelial cells or macrophages (Figure 4). 

EXAMPLE 6 
Cloning and Characterization of EBI 3 

Subtractive hybridization screening of a BL41/B95-8 cDNA library has 
permitted the identification of a number of genes expressed at higher levels in 
EBV-infected BL cells compared with matched EBV(-) cells. Twenty-five 
putative EBV-induced gene clones were initially isolated. Of these, 13 clones 
matched 8 previously known genes. The remaining 12 clones represented 10 
novel genes. Two of these clones were derived from transcripts of a 
previously uncharacterized gene designated EBV-induced gene 3 (EBI 3). 

The complete nucleotide and amino acid sequences of the larger EBI 3 
clone are shown in Figure 5 (SEQ ID NO:5 and SEQ ID NO:6, respectively). 
The 1182 nucleotide cDNA contains a 690 nucleotide open reading frame. A 
unique AUG codon preceding this reading frame at nucleotides 14-16 
conforms to the Kozak consensus translational initiation sequence. Initiation 
from this site results in the synthesis of a polypeptide with a predicted 
molecular mass of 25,380 Daltons. The first 20 amino acids are highly 
hydrophobic and likely form a signal peptide for membrane translocation with 
a predicted signal peptidase cleavage site following a glycine residue at 
position 20. Two potential asparagine-linked glycosylation sites are also 
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identified. However, no other hydrophobic segments capable of forming the 
• transmembrane domain of an integral membrane protein are evident To 
verify the structure of this cDNA, five additional clones were retrieved from 
the library. All of these exhibited identical sequences throughout the putative 
5 carboxy-terminal portion of the predicted protein. The 3' end of the EBI 3 

nucleotide sequence is notable for its homology to the left monomer of the 
human Alu repeat element This homology extends to and includes the A-rich 
sequences which immediately precede the polyadenylate tail of the mRNA. 
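
As a consistency check on the coordinates quoted above, the following minimal Python sketch derives the end position and codon count of the reading frame. The function name orf_summary is hypothetical and is not part of the original disclosure. The sketch assumes that the 690 nucleotides exclude the termination codon, which agrees with the 230 amino acid length stated for SEQ ID NO:6 and the CDS location 14..703 stated for SEQ ID NO:5.

```python
def orf_summary(start, orf_length_nt):
    """Return (end_position, codon_count) for an open reading frame.

    Coordinates are 1-based and inclusive, as in the sequence listing.
    """
    if orf_length_nt % 3 != 0:
        raise ValueError("an open reading frame must be a multiple of 3 nt")
    end = start + orf_length_nt - 1
    return end, orf_length_nt // 3


# Values quoted in the text: AUG at nucleotides 14-16, 690 nt reading frame.
end, codons = orf_summary(14, 690)
print(end, codons)  # 703 230

# Mean mass per residue implied by the predicted 25,380 Dalton polypeptide.
print(round(25380 / codons, 1))  # 110.3
```

The mean of roughly 110 Daltons per residue is in the range usually expected for an average protein, so the stated reading frame length and predicted mass are mutually consistent.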
The EBI 3 nucleotide and amino acid sequences were compared with
all known sequences of the Genbank nucleic acid, and Genbank translation,
Protein Identification Resource (PIR) and Swiss Protein databases,
respectively, using the Experimental GENINFO(R) BLAST-server network of
the National Center for Biotechnology Information. No significant nucleotide
homologies were observed, excluding matches with the 3' untranslated Alu
repeat. However, the predicted EBI 3 protein is approximately 30% identical
to the receptor for ciliary neurotrophic factor (CNTF), with conservative
amino acid changes at many of the non-identical residues. Of particular
significance is the pattern of conserved residues, which includes 4 cysteines at
positions 35, 46, 80 and 90, respectively, of the complete EBI 3 protein
sequence; tryptophans at positions 48 and 150; proline at position 125; and
aliphatic hydrophobic residues at positions 128, 136, 148 and 204. In
addition, the EBI 3 sequence LSDWS at residues 215 to 219 closely matches
the WSDWS sequence of the CNTF receptor. These conserved structural
features are characteristic of and unique to members of the cytokine receptor
family. The predicted EBI 3 protein exhibits less extensive homologies with
the p40 subunit of interleukin 12 (IL-12), also known as natural killer cell
stimulatory factor. Though a secreted protein, IL-12 p40 possesses the same
conserved residues and is also a member of the cytokine receptor family. In
addition, the carboxy terminal 100 amino acids of the EBI 3 protein exhibit
structural homologies with type III fibronectin domains of a variety of
adhesion related molecules, including tenascin, cytotactin and the neural cell
adhesion molecule, NCAM. This feature has also been described among other
cytokine receptor family members.
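
The notion of percent identity used above can be illustrated with a minimal Python sketch. It is not the BLAST procedure used in the comparison; it only counts identical positions between two strings that are assumed to be already aligned and of equal length. The function name percent_identity is hypothetical. The only sequence data used are the two five-residue motifs quoted in the text.

```python
def percent_identity(a, b):
    """Percent of identical positions between two pre-aligned sequences.

    Both strings must have the same length; gap characters ('-') never
    count as matches.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


# EBI 3 residues 215-219 versus the CNTF receptor motif quoted above.
print(percent_identity("LSDWS", "WSDWS"))  # 80.0
```

Four of the five motif positions are identical, which is what "closely matches" refers to. The overall figure of approximately 30% is computed the same way over the full aligned length.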

Hybridization of a 32P-labeled EBI 3 probe to RNA blots detects a 1.5
kb RNA in the EBV-infected cell lines IB4 and BL41/B95-8 (Figure 6). EBI 3
RNA is undetectable, however, in the EBV(-) control cell line BL41. To
provide standards for the amounts of RNA loaded in each lane, parallel blots
were hybridized with probes for glyceraldehyde phosphate dehydrogenase
(GAPDH) and actin. These probes indicate that the BL41 lane contains as
much or more RNA than the EBV-infected cell lanes.

Examination of a series of human cell lines and lymphoid tissues
indicated that EBI 3 is expressed at very low levels in normal unfractionated
resting lymphocytes of spleen and tonsil, but is undetectable in peripheral
blood mononuclear cells (PBMC). However, stimulation of PBMC with the
B and T lymphocyte activating agent, pokeweed mitogen, results in induction
of the EBI 3 mRNA. Lower levels of EBI 3 RNA were detected in
phytohemagglutinin stimulated peripheral blood T lymphocytes. In addition
to IB4 and BL41/B95-8 cells, a recently established lymphoblastoid cell line
transformed with the W91 EBV strain also exhibited significant EBI 3
expression. EBI 3 RNA was undetectable in a second EBV(-) BL cell line,
BL30, in BL41 cells infected with the non-transforming P3HR1 EBV strain,
and in all human myeloid, T lymphoid or epithelial cell lines examined.

Expression of EBI 3 RNA was also analyzed in a variety of non-lymphoid
human tissues (Figure 7A). Abundant expression was observed in
placenta, significantly exceeding expression levels observed in any lymphoid
cell type. EBI 3 RNA was also faintly detectable in liver RNA. However,
rehybridization of this blot with an immunoglobulin heavy chain probe
indicated detectable Ig gene expression, probably due to infiltration of liver
tissue with lymphocytes in vivo. The apparent expression of EBI 3 in liver
could therefore be due to expression in resident lymphocytes.



All publications mentioned hereinabove are hereby incorporated in their
entirety by reference.

While the foregoing invention has been described in some detail for 
purposes of clarity and understanding, it will be appreciated by one skilled in 
the art from a reading of this disclosure that various changes in form and 
detail can be made without departing from the true scope of the invention and 
appended claims.

Table 1

1 Induction levels were calculated as the ratio of signal intensities
(BL41/B95-8 to BL41) for individual probes, divided by the ratio of signal
intensities for the actin probe.

2 The 1.2 kb EBI 1 clone identified on initial screen was incomplete.
Rescreening of the cDNA library resulted in isolation of several
additional full-length clones, the largest of which was 2.14 kb.

3 Induction of beta actin RNA was calculated as the ratio of actin signal
intensities to the ratio of signal intensities for the glyceraldehyde
phosphate dehydrogenase probe.
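
The normalization described in footnote 1 can be written out as a minimal Python sketch. The function name induction_level and the intensity values are hypothetical and chosen only for illustration; they are not measurements from Table 1.

```python
def induction_level(probe_infected, probe_control,
                    actin_infected, actin_control):
    """Induction level of a probe, normalized to the actin signal.

    Each argument is a blot signal intensity in arbitrary units:
    'infected' refers to BL41/B95-8 cells and 'control' to BL41 cells.
    """
    probe_ratio = probe_infected / probe_control
    actin_ratio = actin_infected / actin_control
    return probe_ratio / actin_ratio


# Hypothetical intensities, for illustration only.
print(induction_level(900.0, 100.0, 300.0, 200.0))  # 6.0
```

Dividing by the actin ratio corrects for unequal RNA loading between the two lanes, so a ninefold raw difference becomes a sixfold induction when the infected lane carries 1.5 times as much actin signal.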
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Birkenbach, Mark
Kieff, Elliot

(ii) TITLE OF INVENTION: Epstein Barr Virus Induced Genes

(iii) NUMBER OF SEQUENCES: 6

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Sterne, Kessler, Goldstein & Fox
(B) STREET: 1100 New York Avenue N.W., Suite 600
(C) CITY: Washington
(D) STATE: D.C.
(E) COUNTRY: U.S.A.
(F) ZIP: 20005-3934

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1,

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: PCT (to be assigned)
(B) FILING DATE: herewith
(C) CLASSIFICATION:

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (202) 371-2600
(B) TELEFAX: (202) 371-2540



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2154 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 64..1197

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GGAATTCCGT AGTGCGAGGC CGGGCACAGC CTTCCTGTGT GGTTTTACCG CCCAGAGAGC 60

GTC ATG GAC CTG GGG AAA CCA ATG AAA AGC GTG CTG GTG GTG GCT CTC 108
    Met Asp Leu Gly Lys Pro Met Lys Ser Val Leu Val Val Ala Leu
    1               5                   10                  15

CTT GTC ATT TTC CAG GTA TGC CTG TGT CAA GAT GAG GTC ACG GAC GAT 156
Leu Val Ile Phe Gln Val Cys Leu Cys Gln Asp Glu Val Thr Asp Asp
                20                  25                  30

TAC ATC GGA GAC AAC ACC ACA GTG GAC TAC ACT TTG TTC GAG TCT TTG 204
Tyr Ile Gly Asp Asn Thr Thr Val Asp Tyr Thr Leu Phe Glu Ser Leu
            35                  40                  45

TGC TCC AAG AAG GAC GTG CGG AAC TTT AAA GCC TGG TTC CTC CCT ATC 252
Cys Ser Lys Lys Asp Val Arg Asn Phe Lys Ala Trp Phe Leu Pro Ile
        50                  55                  60

ATG TAC TCC ATC ATT TGT TTC GTG GGC CTA CTG GGC AAT GGG CTG GTC 300
Met Tyr Ser Ile Ile Cys Phe Val Gly Leu Leu Gly Asn Gly Leu Val
    65                  70                  75

GTG TTG ACC TAT ATC TAT TTC AAG AGG CTC AAG ACC ATG ACC GAT ACC 348
Val Leu Thr Tyr Ile Tyr Phe Lys Arg Leu Lys Thr Met Thr Asp Thr
80                  85                  90                  95

TAC CTG CTC AAC CTG GCG GTG GCA GAC ATC CTC TTC CTC CTG ACC CTT 396
Tyr Leu Leu Asn Leu Ala Val Ala Asp Ile Leu Phe Leu Leu Thr Leu
                100                 105                 110

CCC TTC TGG GCC TAC AGC GCG GCC AAG TCC TGG GTC TTC GGT GTC CAC 444
Pro Phe Trp Ala Tyr Ser Ala Ala Lys Ser Trp Val Phe Gly Val His
            115                 120                 125

TTT TGC AAG CTC ATC TTT GCC ATC TAC AAG ATG AGC TTC TTC AGT GGC 492
Phe Cys Lys Leu Ile Phe Ala Ile Tyr Lys Met Ser Phe Phe Ser Gly
        130                 135                 140

ATG CTC CTA CTT CTT TGC ATC AGC ATT GAC CGC TAC GTG GCC ATC GTC 540
Met Leu Leu Leu Leu Cys Ile Ser Ile Asp Arg Tyr Val Ala Ile Val
    145                 150                 155

CAG GCT GTC TCA GCT CAC CGC CAC CGT GCC CGC GTC CTT CTC ATC AGC 588
Gln Ala Val Ser Ala His Arg His Arg Ala Arg Val Leu Leu Ile Ser
160                 165                 170                 175

AAG CTG TCC TGT GTG GGC AGC GCC ATA CTA GCC ACA GTG CTC TCC ATC 636
Lys Leu Ser Cys Val Gly Ser Ala Ile Leu Ala Thr Val Leu Ser Ile
                180                 185                 190

CCA GAG CTC CTG TAC AGT GAC CTC CAG AGG AGC AGC AGT GAG CAA GCG 684
Pro Glu Leu Leu Tyr Ser Asp Leu Gln Arg Ser Ser Ser Glu Gln Ala
            195                 200                 205

ATG CGA TGC TCT CTC ATC ACA GAG CAT GTG GAG GCC TTT ATC ACC ATC 732
Met Arg Cys Ser Leu Ile Thr Glu His Val Glu Ala Phe Ile Thr Ile
        210                 215                 220

CAG GTG GCC CAG ATG GTG ATC GGC TTT CTG GTC CCC CTG CTC GCC ATG 780
Gln Val Ala Gln Met Val Ile Gly Phe Leu Val Pro Leu Leu Ala Met
    225                 230                 235

AGC TTC TGT TAC CTT GTC ATC ATC CGC ACC CTG CTC CAG GCA CGC AAC 828
Ser Phe Cys Tyr Leu Val Ile Ile Arg Thr Leu Leu Gln Ala Arg Asn
240                 245                 250                 255

TTT GAG CGC AAC AAG GCC ATC AAG GTG ATC ATC GCT GTG GTC GTG GTC 876
Phe Glu Arg Asn Lys Ala Ile Lys Val Ile Ile Ala Val Val Val Val
                260                 265                 270

TTC ATA GTC TTC CAG CTG CCC TAC AAT GGG GTG GTC CTG GCC CAG ACG 924
Phe Ile Val Phe Gln Leu Pro Tyr Asn Gly Val Val Leu Ala Gln Thr
            275                 280                 285

GTG GCC AAC TTC AAC ATC ACC AGT AGC ACC TGT GAG CTC AGT AAG CAA 972
Val Ala Asn Phe Asn Ile Thr Ser Ser Thr Cys Glu Leu Ser Lys Gln
        290                 295                 300

CTC AAC ATC GCC TAC GAC GTC ACC TAC AGC CTG GCC TGC GTC CGC TGC 1020
Leu Asn Ile Ala Tyr Asp Val Thr Tyr Ser Leu Ala Cys Val Arg Cys
    305                 310                 315

TGC GTC AAC CCT TTC TTG TAC GCC TTC ATC GGC GTC AAG TTC CGC AAC 1068
Cys Val Asn Pro Phe Leu Tyr Ala Phe Ile Gly Val Lys Phe Arg Asn
320                 325                 330                 335

GAT ATC TTC AAG CTC TTC AAG GAC CTG GGC TGC CTC AGC CAG GAG CAG 1116
Asp Ile Phe Lys Leu Phe Lys Asp Leu Gly Cys Leu Ser Gln Glu Gln
                340                 345                 350

CTC CGG CAG TGG TCT TCC TGT CGG CAC ATC CGG CGC TCC TCC ATG AGT 1164
Leu Arg Gln Trp Ser Ser Cys Arg His Ile Arg Arg Ser Ser Met Ser
            355                 360                 365

GTC GAG GCC GAG ACC ACC ACC ACC TTC TCC CCA TAGGOGACTC TTCTGCCTGG 1217
Val Glu Ala Glu Thr Thr Thr Thr Phe Ser Pro
        370                 375

ACTAGAGGGA CCTCTCCCAG GGTCCCTGGG GTGGGGATAG GGAGCAGATG CAATGACTCA 1277

GGACATCCCC CCGCCAAAAG CTGCTCAGGG GAAAAAGCAG CTCTCCCCTC AGAGTGCAAG 1337

CCCCTGCTCC AGAAGATAGC TTCACCCCAA TCCCAGCTAC CTCAACCAAT GCCAAAAAAA 1397

GACAGGGCTG ATAAGCTAAC ACCAGACAGA CAACACTGGG AAACAGAGGC TATTGTCCCC 1457 

XAAACCAAAA ACTGAAAGTG AAAGTCCAGA AACTGTTCCC ACCTOCTGGA GTOAAOGGGC 1517 

CAAGGAGGGT GAGTGCAAGG GGCGTGGGAG TGGCCTGAAG AGTCCTCTGA ATGAACCTTC 1577 

TGGCCTCCCA CAGACTCAAA TGCTCAGACC AOCTCTTCCG AAAACCAGGC CTTATCTCCA 1637 

AGACCAGAGA TAGTGGGGAG ACTTCTTGGC TTGGTGAGGA AAAGCGGACA TCAGCTGGTC 1697 

AAACAAACTC TCTGAACCCC TCCCTCCATC GTTTTCTTCA CTGTCCTCCA AGCCAGCGGG 1757 
AATGGCAGCT GCCACGCCGC CCTAAAAGCA CACTCATCCC CTCACTTGCC GOGTCGCCCT 1817 
CCCAGGCTCT CAACAGGGGA GAGTGTGGTG TTTCCTGCAG GCCAGGCCAG CTGCCTCCGC 1877 
GTGATCAAAG CCACACTCTG GGCTCCAGAG TGGGGATGAC ATGCACTCAG CTCTTGGCTC 1937 
CACTGGGATG GGAGGAQAGG ACAAGGGAAA TGTCAGGGGC GGGGAGGGTG ACAGTGGCCG 1997 
CCCAAGGCCA CGAGCTTGTT CTTTGTTCTT TGTCACAGGG ACTGAAAACC TCTCCTCATG 2057 
TTCTGCTTTC GATTCGTTAA GAGAGCAACA TTTTACCCAC ACACAQATAA AGTTTTCCCT 2117 
TGAGGAAACA ACAGCTTTAA AAAAAAAAAA GGAATTC 2154



(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 378 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Asp Leu Gly Lys Pro Met Lys Ser Val Leu Val Val Ala Leu Leu
1               5                   10                  15

Val Ile Phe Gln Val Cys Leu Cys Gln Asp Glu Val Thr Asp Asp Tyr
            20                  25                  30

Ile Gly Asp Asn Thr Thr Val Asp Tyr Thr Leu Phe Glu Ser Leu Cys
        35                  40                  45

Ser Lys Lys Asp Val Arg Asn Phe Lys Ala Trp Phe Leu Pro Ile Met
    50                  55                  60

Tyr Ser Ile Ile Cys Phe Val Gly Leu Leu Gly Asn Gly Leu Val Val
65                  70                  75                  80

Leu Thr Tyr Ile Tyr Phe Lys Arg Leu Lys Thr Met Thr Asp Thr Tyr
                85                  90                  95

Leu Leu Asn Leu Ala Val Ala Asp Ile Leu Phe Leu Leu Thr Leu Pro
            100                 105                 110

Phe Trp Ala Tyr Ser Ala Ala Lys Ser Trp Val Phe Gly Val His Phe
        115                 120                 125

Cys Lys Leu Ile Phe Ala Ile Tyr Lys Met Ser Phe Phe Ser Gly Met
    130                 135                 140

Leu Leu Leu Leu Cys Ile Ser Ile Asp Arg Tyr Val Ala Ile Val Gln
145                 150                 155                 160

Ala Val Ser Ala His Arg His Arg Ala Arg Val Leu Leu Ile Ser Lys
                165                 170                 175

Leu Ser Cys Val Gly Ser Ala Ile Leu Ala Thr Val Leu Ser Ile Pro
            180                 185                 190

Glu Leu Leu Tyr Ser Asp Leu Gln Arg Ser Ser Ser Glu Gln Ala Met
        195                 200                 205

Arg Cys Ser Leu Ile Thr Glu His Val Glu Ala Phe Ile Thr Ile Gln
    210                 215                 220

Val Ala Gln Met Val Ile Gly Phe Leu Val Pro Leu Leu Ala Met Ser
225                 230                 235                 240

Phe Cys Tyr Leu Val Ile Ile Arg Thr Leu Leu Gln Ala Arg Asn Phe
                245                 250                 255

Glu Arg Asn Lys Ala Ile Lys Val Ile Ile Ala Val Val Val Val Phe
            260                 265                 270

Ile Val Phe Gln Leu Pro Tyr Asn Gly Val Val Leu Ala Gln Thr Val
        275                 280                 285

Ala Asn Phe Asn Ile Thr Ser Ser Thr Cys Glu Leu Ser Lys Gln Leu
    290                 295                 300

Asn Ile Ala Tyr Asp Val Thr Tyr Ser Leu Ala Cys Val Arg Cys Cys
305                 310                 315                 320

Val Asn Pro Phe Leu Tyr Ala Phe Ile Gly Val Lys Phe Arg Asn Asp
                325                 330                 335

Ile Phe Lys Leu Phe Lys Asp Leu Gly Cys Leu Ser Gln Glu Gln Leu
            340                 345                 350

Arg Gln Trp Ser Ser Cys Arg His Ile Arg Arg Ser Ser Met Ser Val
        355                 360                 365

Glu Ala Glu Thr Thr Thr Thr Phe Ser Pro
    370                 375

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1643 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 34..1116

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

GGAATTCCCT GATATACACC TGGACCACCA CCA ATG GAT ATA CAA ATG GCA AAC 54
                                     Met Asp Ile Gln Met Ala Asn
                                     1               5

AAT TTT ACT CCG CCC TCT GCA ACT CCT CAG GGA AAT GAC TGT GAC CTC 102
Asn Phe Thr Pro Pro Ser Ala Thr Pro Gln Gly Asn Asp Cys Asp Leu
        10                  15                  20

TAT GCA CAT CAC AGC ACG GCC AGG ATA GTA ATG CCT CTG CAT TAC AGC 150
Tyr Ala His His Ser Thr Ala Arg Ile Val Met Pro Leu His Tyr Ser
    25                  30                  35

CTC GTC TTC ATC ATT GGG CTC GTG GGA AAC TTA CTA GCC TTG GTC GTC 198
Leu Val Phe Ile Ile Gly Leu Val Gly Asn Leu Leu Ala Leu Val Val
40                  45                  50                  55

ATT GTT CAA AAC AGG AAA AAA ATC AAC TCT ACC ACC CTC TAT TCA ACA 246
Ile Val Gln Asn Arg Lys Lys Ile Asn Ser Thr Thr Leu Tyr Ser Thr
                60                  65                  70

AAT TTG GTG ATT TCT GAT ATA CTT TTT ACC ACG GCT TTG CCT ACA CGA 294
Asn Leu Val Ile Ser Asp Ile Leu Phe Thr Thr Ala Leu Pro Thr Arg
            75                  80                  85

ATA GCC TAC TAT GCA ATG GGC TTT GAC TGG AGA ATC GGA GAT GCC TTG 342
Ile Ala Tyr Tyr Ala Met Gly Phe Asp Trp Arg Ile Gly Asp Ala Leu
        90                  95                  100

TGT AGG ATA ACT GCG CTA GTG TTT TAC ATC AAC ACA TAT GCA GGT GTG 390
Cys Arg Ile Thr Ala Leu Val Phe Tyr Ile Asn Thr Tyr Ala Gly Val
    105                 110                 115

AAC TTT ATG ACC TGC CTG AGT ATT GAC CGC TTC ATT GCT GTG GTG CAC 438
Asn Phe Met Thr Cys Leu Ser Ile Asp Arg Phe Ile Ala Val Val His
120                 125                 130                 135

CCT CTA CGC TAC AAC AAG ATA AAA AGG ATT GAA CAT GCA AAA GGC GTG 486
Pro Leu Arg Tyr Asn Lys Ile Lys Arg Ile Glu His Ala Lys Gly Val
                140                 145                 150

TGC ATA TTT GTC TGG ATT CTA GTA TTT GCT CAG ACA CTC CCA CTC CTC 534
Cys Ile Phe Val Trp Ile Leu Val Phe Ala Gln Thr Leu Pro Leu Leu
            155                 160                 165

ATC AAC CCT ATG TCA AAG CAG GAG GCT GAA AGG ATT ACA TGC ATG GAG 582
Ile Asn Pro Met Ser Lys Gln Glu Ala Glu Arg Ile Thr Cys Met Glu
        170                 175                 180

TAT CCA AAC TTT GAA GAA ACT AAA TCT CTT CCC TGG ATT CTG CTT GGG 630
Tyr Pro Asn Phe Glu Glu Thr Lys Ser Leu Pro Trp Ile Leu Leu Gly
    185                 190                 195

GCA TGT TTC ATA GGA TAT GTA CTT CCA CTT ATA ATC ATT CTC ATC TGC 678
Ala Cys Phe Ile Gly Tyr Val Leu Pro Leu Ile Ile Ile Leu Ile Cys
200                 205                 210                 215

TAT TCT CAG ATC TGC TGC AAA CTC TTC AGA ACT GCC AAA CAA AAC CCA 726
Tyr Ser Gln Ile Cys Cys Lys Leu Phe Arg Thr Ala Lys Gln Asn Pro
                220                 225                 230

CTC ACT GAG AAA TCT GGT GTA AAC AAA AAG GCT CTC AAC ACA ATT ATT 774
Leu Thr Glu Lys Ser Gly Val Asn Lys Lys Ala Leu Asn Thr Ile Ile
            235                 240                 245

CTT ATT ATT GTT GTG TTT GTT CTC TGT TTC ACA CCT TAC CAT GTT GCA 822
Leu Ile Ile Val Val Phe Val Leu Cys Phe Thr Pro Tyr His Val Ala
        250                 255                 260

ATT ATT CAA CAT ATG ATT AAG AAG CTT CGT TTC TCT AAT TTC CTG GAA 870
Ile Ile Gln His Met Ile Lys Lys Leu Arg Phe Ser Asn Phe Leu Glu
    265                 270                 275

TGT AGC CAA AGA CAT TCG TTC CAG ATT TCT CTG CAC TTT ACA GTA TGC 918
Cys Ser Gln Arg His Ser Phe Gln Ile Ser Leu His Phe Thr Val Cys
280                 285                 290                 295

CTG ATG AAC TTC AAT TGC TGC ATG GAC CCT TTT ATC TAC TTC TTT GCA 966
Leu Met Asn Phe Asn Cys Cys Met Asp Pro Phe Ile Tyr Phe Phe Ala
                300                 305                 310

TGT AAA GGG TAT AAG AGA AAG GTT ATG AGG ATG CTG AAA CGG CAA GTC 1014
Cys Lys Gly Tyr Lys Arg Lys Val Met Arg Met Leu Lys Arg Gln Val
            315                 320                 325

AGT GTA TCG ATT TCT AGT GCT GTG AAG TCA GCC CCT GAA GAA AAT TCA 1062
Ser Val Ser Ile Ser Ser Ala Val Lys Ser Ala Pro Glu Glu Asn Ser
        330                 335                 340

CGT GAA ATG ACA GAA ACG CAG ATG ATG ATA CAT TCC AAG TCT TCA AAT 1110
Arg Glu Met Thr Glu Thr Gln Met Met Ile His Ser Lys Ser Ser Asn
    345                 350                 355

GGA AAG TGAAATGGAT TGTATTTTGG TTTATAGTGA CGTAAACTGT ATGACAAACT 1166
Gly Lys
360

TTGCAGGACT TCCCTTATAA AGCAAAATAA TTGTTCAGCT TCCAATTAGT ATTCTTTTAT 1226

ATTTCTTTCA TTGGGCGCTT TCCCATCTCC AACTCGGAAG TAAGCCCAAG AGAACAACAT 1286

AAAGCAAACA ACATAAAGCA CAATAAAAAT GCAAATAAAT ATTTTCATTT TTATTTGTAA 1346

ACGAATACAC CAAAAGGAGG CGCTCTTAAT AACTCCCAAT GTAAAAAGTT TTGTTTTAAT 1406

AAAAAATTAA TTATTATTCT TGCCAACAAA TGGCTAGAAA GGACTGAATA GATTATATAT 1466

TGCCAGATGT TAATACTGTA ACATACTTTT TAAATAACAT ATTTCTTAAA TCCAAATTTC 1526

TCTCAATGTT AGATTTAATT CCCTCAATAA CACCAATGTT TTGTTTTGTT TCGTTCTGGG 1586

TCATAAAACT TTGTTAAGGA ACTCTTTTGG AATAAAGAGC AGGATGCTGC GGAATTC 1643

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 361 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Asp Ile Gln Met Ala Asn Asn Phe Thr Pro Pro Ser Ala Thr Pro
1               5                   10                  15

Gln Gly Asn Asp Cys Asp Leu Tyr Ala His His Ser Thr Ala Arg Ile
            20                  25                  30

Val Met Pro Leu His Tyr Ser Leu Val Phe Ile Ile Gly Leu Val Gly
        35                  40                  45

Asn Leu Leu Ala Leu Val Val Ile Val Gln Asn Arg Lys Lys Ile Asn
    50                  55                  60

Ser Thr Thr Leu Tyr Ser Thr Asn Leu Val Ile Ser Asp Ile Leu Phe
65                  70                  75                  80

Thr Thr Ala Leu Pro Thr Arg Ile Ala Tyr Tyr Ala Met Gly Phe Asp
                85                  90                  95

Trp Arg Ile Gly Asp Ala Leu Cys Arg Ile Thr Ala Leu Val Phe Tyr
            100                 105                 110

Ile Asn Thr Tyr Ala Gly Val Asn Phe Met Thr Cys Leu Ser Ile Asp
        115                 120                 125

Arg Phe Ile Ala Val Val His Pro Leu Arg Tyr Asn Lys Ile Lys Arg
    130                 135                 140

Ile Glu His Ala Lys Gly Val Cys Ile Phe Val Trp Ile Leu Val Phe
145                 150                 155                 160

Ala Gln Thr Leu Pro Leu Leu Ile Asn Pro Met Ser Lys Gln Glu Ala
                165                 170                 175

Glu Arg Ile Thr Cys Met Glu Tyr Pro Asn Phe Glu Glu Thr Lys Ser
            180                 185                 190

Leu Pro Trp Ile Leu Leu Gly Ala Cys Phe Ile Gly Tyr Val Leu Pro
        195                 200                 205

Leu Ile Ile Ile Leu Ile Cys Tyr Ser Gln Ile Cys Cys Lys Leu Phe
    210                 215                 220

Arg Thr Ala Lys Gln Asn Pro Leu Thr Glu Lys Ser Gly Val Asn Lys
225                 230                 235                 240

Lys Ala Leu Asn Thr Ile Ile Leu Ile Ile Val Val Phe Val Leu Cys
                245                 250                 255

Phe Thr Pro Tyr His Val Ala Ile Ile Gln His Met Ile Lys Lys Leu
            260                 265                 270

Arg Phe Ser Asn Phe Leu Glu Cys Ser Gln Arg His Ser Phe Gln Ile
        275                 280                 285

Ser Leu His Phe Thr Val Cys Leu Met Asn Phe Asn Cys Cys Met Asp
    290                 295                 300

Pro Phe Ile Tyr Phe Phe Ala Cys Lys Gly Tyr Lys Arg Lys Val Met
305                 310                 315                 320

Arg Met Leu Lys Arg Gln Val Ser Val Ser Ile Ser Ser Ala Val Lys
                325                 330                 335

Ser Ala Pro Glu Glu Asn Ser Arg Glu Met Thr Glu Thr Gln Met Met
            340                 345                 350

Ile His Ser Lys Ser Ser Asn Gly Lys
        355                 360

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1164 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 14..703

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

GAATTCCGCA GCC ATG ACC CCG CAG CTT CTC CTG GCC CTT GTC CTC TGG 49
               Met Thr Pro Gln Leu Leu Leu Ala Leu Val Leu Trp
               1               5                   10

GCC AGC TGC CCG CCC TGC AGT GGA AGG AAA GGG CCC CCA GCA GCT CTG 97
Ala Ser Cys Pro Pro Cys Ser Gly Arg Lys Gly Pro Pro Ala Ala Leu
        15                  20                  25

ACA CTG CCC CGG GTG CAA TGC CGA GCC TCT CGG TAC CCG ATC GCC GTG 145
Thr Leu Pro Arg Val Gln Cys Arg Ala Ser Arg Tyr Pro Ile Ala Val
    30                  35                  40

GAT TGC TCC TGG ACC CTG CCG CCT GCT CCA AAC TCC ACC AGC CCC GGT 193
Asp Cys Ser Trp Thr Leu Pro Pro Ala Pro Asn Ser Thr Ser Pro Gly
45                  50                  55                  60

GTC CGT GGA TTG CGA CGT ACA GGC TCG GCA TGG CTG CCC GGG GCC ACA 241
Val Arg Gly Leu Arg Arg Thr Gly Ser Ala Trp Leu Pro Gly Ala Thr
                65                  70                  75

GCG TGG CCC TGC CTG CAG CAG ACG CCA ACG TCC ACC AGC TGC ACC ATC 289
Ala Trp Pro Cys Leu Gln Gln Thr Pro Thr Ser Thr Ser Cys Thr Ile
            80                  85                  90

ACG GAT GTC CAG CTG TTC TCC ATG GCT CCC TAC GTG CTC AAT GTC ACC 337
Thr Asp Val Gln Leu Phe Ser Met Ala Pro Tyr Val Leu Asn Val Thr
        95                  100                 105

GCC GTC CAC CCC TGG GGC TCC AGC AGC AGC TTC GTG CCT TTC ATA ACA 385
Ala Val His Pro Trp Gly Ser Ser Ser Ser Phe Val Pro Phe Ile Thr
    110                 115                 120

GAG CAC ATC ATC AAG CCC GAC CCT CCA GAA GGC GTG CGC CTA AGC CCC 433
Glu His Ile Ile Lys Pro Asp Pro Pro Glu Gly Val Arg Leu Ser Pro
125                 130                 135                 140

CTC GCT GAG CGC CAC GTA CAG GTG CAG TGG GAG CCT CCC GGG TCC TGG 481
Leu Ala Glu Arg His Val Gln Val Gln Trp Glu Pro Pro Gly Ser Trp
                145                 150                 155

CCC TTC CCA GAG ATC TTC TCA CTG AAG TAC TGG ATC CGT TAC AAG CGT 529
Pro Phe Pro Glu Ile Phe Ser Leu Lys Tyr Trp Ile Arg Tyr Lys Arg
            160                 165                 170

CAG GGA GCT GCG CGC TTC CAC CGG GTG GGG CCC ATT GAA GCC ACG TCC 577
Gln Gly Ala Ala Arg Phe His Arg Val Gly Pro Ile Glu Ala Thr Ser
        175                 180                 185

TTC ATC CTC AGG GCT GTG CGG CCC CGA GCC AGG TAC TAC GTC CAA GTG 625
Phe Ile Leu Arg Ala Val Arg Pro Arg Ala Arg Tyr Tyr Val Gln Val
    190                 195                 200

GCG GCT CAG GAC CTC ACA GAC TAC GGG GAA CTG AGT GAC TGG AGT CTC 673
Ala Ala Gln Asp Leu Thr Asp Tyr Gly Glu Leu Ser Asp Trp Ser Leu
205                 210                 215                 220

CCC GCC ACT GCC ACA ATG AGC CTG GGC AAG TAGCAAGGGC TTCCOGCTGC 723
Pro Ala Thr Ala Thr Met Ser Leu Gly Lys
                225                 230

CTCCAGACAG CACCTGGGTC CTCGCCACCC TAAGCCCCGG GACACCTGTT GGAGGGCGGA 783

TGGGATCTGC CTAGCCTGGG CTGGAGTCCT TGCTTTGCTG CTGCTGAGCT GCOGGGCAAC 843

CTCAGATGAC CGACTTTTCC CTTTGAGCCT CAGTTTCTCT AGCTGAGAAA TGGAGATGTA 903

CTACTCTCTC CTTTACCTTT ACCTTTACCA CAGTGCAGGG CTGACTGAAC TGTCACTGTG 963

AGATATTTTT TATTGTTTAA TTAGAAAAGA ATTGTTGTTG GGCTGGGCGC AGTGGATCGC 1023

ACCTGTAATC CCAGTCACTG GGAAGCCGAC GTGGGTGGGT AGCTTGAGGC CAGGAGCTCG 1083

AAACCAGTCC GGGCCACACA GCAAGACCCC ATCTCTAAAA AATTAATATA AATATAAAAT 1143

AAAAAAAAAA AAAAGGAATT C 1164



(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 230 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Met Thr Pro Gln Leu Leu Leu Ala Leu Val Leu Trp Ala Ser Cys Pro
1               5                   10                  15

Pro Cys Ser Gly Arg Lys Gly Pro Pro Ala Ala Leu Thr Leu Pro Arg
            20                  25                  30

Val Gln Cys Arg Ala Ser Arg Tyr Pro Ile Ala Val Asp Cys Ser Trp
        35                  40                  45

Thr Leu Pro Pro Ala Pro Asn Ser Thr Ser Pro Gly Val Arg Gly Leu
    50                  55                  60

Arg Arg Thr Gly Ser Ala Trp Leu Pro Gly Ala Thr Ala Trp Pro Cys
65                  70                  75                  80

Leu Gln Gln Thr Pro Thr Ser Thr Ser Cys Thr Ile Thr Asp Val Gln
                85                  90                  95

Leu Phe Ser Met Ala Pro Tyr Val Leu Asn Val Thr Ala Val His Pro
            100                 105                 110

Trp Gly Ser Ser Ser Ser Phe Val Pro Phe Ile Thr Glu His Ile Ile
        115                 120                 125

Lys Pro Asp Pro Pro Glu Gly Val Arg Leu Ser Pro Leu Ala Glu Arg
    130                 135                 140

His Val Gln Val Gln Trp Glu Pro Pro Gly Ser Trp Pro Phe Pro Glu
145                 150                 155                 160

Ile Phe Ser Leu Lys Tyr Trp Ile Arg Tyr Lys Arg Gln Gly Ala Ala
                165                 170                 175

Arg Phe His Arg Val Gly Pro Ile Glu Ala Thr Ser Phe Ile Leu Arg
            180                 185                 190

Ala Val Arg Pro Arg Ala Arg Tyr Tyr Val Gln Val Ala Ala Gln Asp
        195                 200                 205

Leu Thr Asp Tyr Gly Glu Leu Ser Asp Trp Ser Leu Pro Ala Thr Ala
    210                 215                 220

Thr Met Ser Leu Gly Lys
225                 230



WHAT IS CLAIMED IS:

1. A DNA segment coding for a polypeptide having an amino acid
sequence corresponding to a polypeptide selected from the group consisting of
EBI 1, EBI 2, and EBI 3 polypeptides.

2. The DNA segment according to claim 1, wherein the DNA segment
has a sequence selected from the group consisting of sequences set forth in
SEQ ID NO:1, SEQ ID NO:3, and SEQ ID NO:5; or allelic, mutant or
species variation thereof.

3. The DNA segment according to claim 1, wherein the DNA segment
has an allelic variation of a sequence selected from the group consisting of
sequences set forth in SEQ ID NO:1, SEQ ID NO:3, and SEQ ID NO:5.

4. The DNA segment according to claim 1, wherein the DNA segment
encodes an amino acid sequence selected from the group consisting of
sequences set forth in SEQ ID NO:2, SEQ ID NO:4, and SEQ ID NO:6, or
mutant or species variation thereof.

5. The DNA segment according to claim 1, wherein the DNA segment
has a sequence selected from the group consisting of sequences set forth in
SEQ ID NO:1, SEQ ID NO:3, and SEQ ID NO:5.

6. The DNA segment according to claim 1, wherein the DNA segment
encodes an amino acid sequence selected from the group consisting of
sequences set forth in SEQ ID NO:2, SEQ ID NO:4, and SEQ ID NO:6.

7. A substantially pure polypeptide having an amino acid sequence
corresponding to a polypeptide selected from the group consisting of EBI 1,
EBI 2, and EBI 3 polypeptides.

8. The polypeptide according to claim 7, wherein the polypeptide has
an amino acid sequence selected from the group consisting of sequences set
forth in SEQ ID NO:2, SEQ ID NO:4, and SEQ ID NO:6, or mutant or
species variation thereof.

9. A nucleic acid probe for the detection of the presence of Epstein
Barr Virus in a sample comprising the DNA segment according to claim 1 or
at least 20 contiguous nucleotides thereof.

10. The nucleic acid probe according to claim 9, wherein the DNA
segment has a nucleic acid sequence selected from the group consisting of
sequences set forth in SEQ ID NO:1, SEQ ID NO:3, and SEQ ID NO:5, or
at least 20 contiguous nucleotides thereof.

11. The nucleic acid probe according to claim 9, wherein the probe
encodes an amino acid sequence selected from the group consisting of
sequences set forth in SEQ ID NO:2, SEQ ID NO:4, and SEQ ID NO:6, or
at least 7 contiguous amino acids thereof.

12. A method of detecting Epstein Barr virus in a sample comprising:

a) contacting said sample with the nucleic acid probe according to
claim 9, under conditions such that hybridization occurs, and

b) detecting the presence of said probe bound to RNA.

13. A kit detecting the presence of Epstein Barr virus in a sample
comprising at least one container means having disposed therein the nucleic
acid probe according to claim 9.

14. A recombinant DNA molecule comprising, 5' to 3', a promoter
effective to initiate transcription in a host cell and the DNA segment according
to claim 1.

15. A cell that contains the DNA molecule according to claim 14.

16. A recombinant DNA molecule comprising a vector and the DNA
segment according to claim 1.

17. A cell that contains the recombinant DNA molecule according to
claim 16.

18. A recombinant DNA molecule comprising a transcriptional region
functional in a cell, a sequence complementary to an RNA sequence encoding
an amino acid sequence corresponding to the polypeptide of claim 7, and a
transcriptional termination region functional in said cell.

19. A cell that contains the recombinant DNA molecule according to
claim 18.

20. An antibody having binding affinity to a polypeptide having an
amino acid sequence selected from the group consisting of EBI 1, EBI 2, and
EBI 3 polypeptides, or a binding fragment thereof.

21. The antibody according to claim 20, wherein said polypeptide has
an amino acid sequence selected from the group of sequences set forth in SEQ
ID NO:2, SEQ ID NO:4, and SEQ ID NO:6, or mutant or species variation
thereof.

22. The antibody according to claim 20, wherein said antibody is a
monoclonal antibody.

23. A method of detecting a polypeptide selected from the group
consisting of EBI 1, EBI 2, and EBI 3 in a sample, comprising:

a) contacting said sample with an antibody according to claim 20,
under conditions such that immunocomplexes form, and

b) detecting the presence of said antibody bound to said polypeptide.

24. A diagnostic kit comprising:

i) a first container means containing the antibody according to
claim 20, and

ii) second container means containing a conjugate comprising
a binding partner of said monoclonal antibody and a label.

25. A hybridoma which produces the monoclonal antibody according
to claim 22, or binding fragment thereof.

26. The hybridoma according to claim 25, wherein said polypeptide
has an amino acid sequence selected from the group of sequences set forth in
SEQ ID NO:2, SEQ ID NO:4, and SEQ ID NO:6, or mutant or species
variation thereof, or at least 7 contiguous amino acids thereof.

GGAATTCCGT AGTGCGAGGC CGGGCACAGC CTTCCTGTGT GGTTTTACCG CCCAGAGAGC 60 

GTC ATG GAC CTG GGG AAA CCA ATG AAA AGC GTG CTG GTG GTG GCT CTC 108 
Met Asp Leu Gly Lys Pro Met Lys Ser Val Leu Val Val Ala Leu 
1 5 10 15 

CTT GTC ATT TTC CAG GTA TGC CTG TGT CAA GAT GAG GTC ACG GAC GAT 156 
Leu Val Ile Phe Gln Val Cys Leu Cys Gln Asp Glu Val Thr Asp Asp 
20 25 30 

TAC ATC GGA GAC AAC ACC ACA GTG GAC TAC ACT TTG TTC GAG TCT TTG 204 
Tyr Ile Gly Asp Asn Thr Thr Val Asp Tyr Thr Leu Phe Glu Ser Leu 
35 CHO 40 45 

TGC TCC AAG AAG GAC GTG GGG AAC TTT AAA GCC TGG TTC CTC CCT ATC 252 
Cys Ser Lys Lys Asp Val Arg Asn Phe Lys Ala Trp Phe Leu Pro Ile 
50 55 60 

ATG TAC TCC ATC ATT TGT TTC GTG GGC CTA CTG GGC AAT GGG CTG GTC 300 
Met Tyr Ser Ile Ile Cys Phe Val Gly Leu Leu Gly Asn Gly Leu Val 
65 70 75 

GTG TTG ACC TAT ATC TAT TTC AAG AGG CTC AAG ACC ATG ACC GAT ACC 348 
Val Leu Thr Tyr Ile Tyr Phe Lys Arg Leu Lys Thr Met Thr Asp Thr 
80 85 90 95 

TAC CTG CTC AAC CTG GCG GTG GCA GAC ATC CTC TTC CTC CTG ACC CTT 396 
Tyr Leu Leu Asn Leu Ala Val Ala Asp Ile Leu Phe Leu Leu Thr Leu 
100 105 110 

CCC TTC TGG GCC TAC AGC GCG GCC AAG TCC TGG GTC TTC GGT GTC CAC 444 
Pro Phe Trp Ala Tyr Ser Ala Ala Lys Ser Trp Val Phe Gly Val His 
115 120 125 

TTT TGC AAG CTC ATC TTT GCC ATC TAC AAG ATG AGC TTC TTC ACT GGC 492 
Phe Cys Lys Leu Ile Phe Ala Ile Tyr Lys Met Ser Phe Phe Ser Gly 
130 135 140 

ATG CTC CTA CTT CTT TGC ATC AGC ATT GAC CGC TAC GTG GCC ATC GTC 540 
Met Leu Leu Leu Leu Cys Ile Ser Ile Asp Arg Tyr Val Ala Ile Val 
145 150 155 

FIG.1A-1 

CAG GCT GTC TCA GCT CAC CGC CAC CGT GCC CGC GTC CTT CTC ATC AGC 588 
Gln Ala Val Ser Ala His Arg His Arg Ala Arg Val Leu Leu Ile Ser 
160 165 170 175 

AAG CTG TCC TGT GTG GGC AGC GCC ATA CTA GCC ACA GTG CTC TCC ATC 636 
Lys Leu Ser Cys Val Gly Ser Ala Ile Leu Ala Thr Val Leu Ser Ile 
180 185 190 

CCA GAG CTC CTG TAC ACT GAC CTC CAG AGG AGC AGC AGT GAG CAA GCG 684 
Pro Glu Leu Leu Tyr Ser Asp Leu Gln Arg Ser Ser Ser Glu Gln Ala 
195 200 205 

ATG CGA TGC TCT CTC ATC ACA GAG CAT GTG GAG GCC TTT ATC ACC ATC 732 
Met Arg Cys Ser Leu Ile Thr Glu His Val Glu Ala Phe Ile Thr Ile 
210 215 220 

CAG GTG GCC CAG ATG GTG ATC GGC TTT CTG GTC CCC CTG CTG GCC ATG 780 
Gln Val Ala Gln Met Val Ile Gly Phe Leu Val Pro Leu Leu Ala Met 
225 230 235 

AGC TTC TGT TAC CTT GTC ATC ATC CGC ACC CTG CTC CAG GCA CGC AAC 828 
Ser Phe Cys Tyr Leu Val Ile Ile Arg Thr Leu Leu Gln Ala Arg Asn 
240 245 250 255 

TTT GAC CGC AAC AAG GCC ATC AAG GTG ATC ATC GCT GTG GTC GTG GTC 876 
Phe Glu Arg Asn Lys Ala Ile Lys Val Ile Ile Ala Val Val Val Val 
260 265 270 

TTC ATA GTC TTC CAG CTG CCC TAC AAT GGG GTG GTC CTG GCC CAG ACG 924 
Phe Ile Val Phe Gln Leu Pro Tyr Asn Gly Val Val Leu Ala Gln Thr 
275 280 285 

GTG GCC AAC TTC AAC ATC ACC AGT AGC ACC TGT GAG CTC AGT AAG CAA 972 
Val Ala Asn Phe Asn Ile Thr Ser Ser Thr Cys Glu Leu Ser Lys Gln 
290 CHO 295 300 

CTC AAC ATC GCC TAC GAC GTC ACC TAC AX CTG GCC TGC GTC CGC TGC 1020 
Leu Asn Ile Ala Tyr Asp Val Thr Tyr Ser Leu Ala Cys Val Arg Cys 
305 310 315 

FIG.1A-2 

TGC GTC AAC CCT TTC TTG TAC GCC TTC ATC GGC GTC AAG TTC CGC AAC 1068 
Cys Val Asn Pro Phe Leu Tyr Ala Phe Ile Gly Val Lys Phe Arg Asn 
320 325 330 335 

GAT ATC TTC AAG CTC TTC AAG GAC CTG GGC TGC CTC AGC CAG GAG CAG 1116 
Asp Ile Phe Lys Leu Phe Lys Asp Leu Gly Cys Leu Ser Gln Glu Gln 
340 345 350 

CTC CGG CAG TGG TCT TCC TGT CGG CAC ATC CGG CGC TCC TCC ATG AGT 1164 
Leu Arg Gln Trp Ser Ser Cys Arg His Ile Arg Arg Ser Ser Met Ser 
355 360 365 

GTG GAG GCC GAG ACC ACC ACC ACC TTC TCC CCA TAGGCGACTC TTCTGCCTGG 1217 
Val Glu Ala Glu Thr Thr Thr Thr Phe Ser Pro *** 
370 375 

ACTAGAGGGA CCTCTCCCAG GGTCCCTGGG GTGGGGATAG GGAGCAGATG CAATGACTCA 1277 

GGACATCCCC CCGCCAAAAG CTGCTCAGGG GAAAAAGCAG CTCTCCCCTC AGAGTGCAAG 1337 

CCCCTGCTCC AGAAGATAGC TTCACCCCAA TCCCAGCTAC CTCAACCAAT GCCAAAAAAA 1397 

GACAGGGCTG ATAAGCTAAC ACCAGACAGA CAACACTGGG AAACAGAGGC TATTGTCCCC 1457 

TAAACCAAAA ACTGAAAGTG AAAGTCCAGA AACTGTTCCC ACCTGCTGGA GTGAAGGGGC 1517 

CAAGGAGGGT GAGTGCAAGG GGCGTGGGAG TGGCCTGAAG AGTCCTCTGA ATGAACCTTC 1577 

TGGCCTCCCA CAGACTCAAA TGCTCAGACC AGCTCTTCCG AAAACCAGGC CTTATCTCCA 1637 

AGACCAGAGA TAGTGGGGAG ACTTCTTGGC TTGGTGAGGA AAAGCGGACA TCAGCTGGTC 1697 

AAACAAACTC TCTGAACCCC TCCCTCCATC GTTTTCTTCA CTGTCCTCCA AGCCAGCGGG 1757 

AATGGCAGCT GCCACGCCGC CCTAAAAGCA CACTCATCCC CTCACTTGCC GCGTCGCCCT 1817 

CCCAGGCTCT CAACAGGGGA GAGTGTGGTG TTTCCTGCAG GCCAGGCCAG CTGCCTCCGC 1877 

GTGATCAAAG CCACACTCTG GGCTCCAGAG TGGGGATGAC ATGCACTCAG CTCTTGGCTC 1937 

FIG.1A-3 

CACTGGGATG GGAGGAGAGG ACAAGGGAAA TGTCAGGGGC GGGGAGGGTG ACAGTGGCCG 1997 

CCCAAGGCCA CGAGCTTGTT CTTTGTTCTT TGTCACAGGG ACTGAAAACC TCTCCTCATG 2057 

TTCTGCTTTC GATTCGTTAA GAGAGCAACA TTTTACCCAC ACACAGATAA AGTTTTCCCT 2117 

TGAGGAAACA ACAGCTTTAA AAAAAAAAAA GGAATTC 2154 

FIG.1A-4 

GGAATTCCCT GATATACACC TGGACCACCA CCA ATG GAT ATA CAA ATG GCA AAC 54 
* ** 
Met Asp Ile Gln Met Ala Asn 
1 5 

AAT TTT ACT CCG CCC TCT GCA ACT CCT CAG GGA AAT GAC TGT GAC CTC 102 
Asn Phe Thr Pro Pro Ser Ala Thr Pro Gln Gly Asn Asp Cys Asp Leu 
CHO 
10 15 20 

TAT GCA CAT CAC AGC ACG GCC AGG ATA GTA ATG CCT CTG CAT TAC AGC 150 
Tyr Ala His His Ser Thr Ala Arg Ile Val Met Pro Leu His Tyr Ser 
25 30 35 

CTC GTC TTC ATC ATT GGG CTC GTG GGA AAC TTA CTA GCC TTG GTC GTC 198 
Leu Val Phe Ile Ile Gly Leu Val Gly Asn Leu Leu Ala Leu Val Val 
40 45 50 55 

ATT GTT CAA AAC AGG AAA AAA ATC AAC TCT ACC ACC CTC TAT TCA ACA 246 
Ile Val Gln Asn Arg Lys Lys Ile Asn Ser Thr Thr Leu Tyr Ser Thr 
60 65 70 

AAT TTG GTG ATT TCT GAT ATA CTT TTT ACC ACG GCT TTG CCT ACA CGA 294 
Asn Leu Val Ile Ser Asp Ile Leu Phe Thr Thr Ala Leu Pro Thr Arg 
75 80 85 

ATA GCC TAC TAT GCA ATG GGC TTT GAC TGG AGA ATC GGA GAT GCC TTG 342 
Ile Ala Tyr Tyr Ala Met Gly Phe Asp Trp Arg Ile Gly Asp Ala Leu 
90 95 100 

TGT AGG ATA ACT GCG CTA GTG TTT TAC ATC AAC ACA TAT GCA GGT GTG 390 
Cys Arg Ile Thr Ala Leu Val Phe Tyr Ile Asn Thr Tyr Ala Gly Val 
105 110 115 

FIG.1B-1 

AAC TTT ATG ACC TGC CTG AGT ATT GAC CGC TTC ATT GCT GTG GTG CAC 438 
Asn Phe Met Thr Cys Leu Ser Ile Asp Arg Phe Ile Ala Val Val His 
120 125 130 135 

CCT CTA CGC TAC AAC AAG ATA AAA AGG ATT GAA CAT GCA AAA GGC GTG 486 
Pro Leu Arg Tyr Asn Lys Ile Lys Arg Ile Glu His Ala Lys Gly Val 
140 145 150 

TGC ATA TTT GTC TGG ATT CTA GTA TTT GCT CAG ACA CTC CCA CTC CTC 534 
Cys Ile Phe Val Trp Ile Leu Val Phe Ala Gln Thr Leu Pro Leu Leu 
155 160 165 

ATC AAC CCT ATG TCA AAG CAG GAG GCT GAA AGG ATT ACA TGC ATG GAG 582 
Ile Asn Pro Met Ser Lys Gln Glu Ala Glu Arg Ile Thr Cys Met Glu 
170 175 180 

TAT CCA AAC TTT GAA GAA ACT AAA TCT CTT CCC TGG ATT CTG CTT GGG 630 
Tyr Pro Asn Phe Glu Glu Thr Lys Ser Leu Pro Trp Ile Leu Leu Gly 
185 190 195 

GCA TGT TTC ATA GGA TAT GTA CTT CCA CTT ATA ATC ATT CTC ATC TGC 678 
Ala Cys Phe Ile Gly Tyr Val Leu Pro Leu Ile Ile Ile Leu Ile Cys 
200 205 210 215 

TAT TCT CAG ATC TGC TGC AAA CTC TTC AGA ACT GCC AAA CAA AAC CCA 726 
Tyr Ser Gln Ile Cys Cys Lys Leu Phe Arg Thr Ala Lys Gln Asn Pro 
220 225 230 

CTC ACT GAG AAA TCT GGT GTA AAC AAA AAG GCT CTC AAC ACA ATT ATT 774 
Leu Thr Glu Lys Ser Gly Val Asn Lys Lys Ala Leu Asn Thr Ile Ile 
235 240 245 

CTT ATT ATT GTT GTG TTT GTT CTC TGT TTC ACA CCT TAC CAT GTT GCA 822 
Leu Ile Ile Val Val Phe Val Leu Cys Phe Thr Pro Tyr His Val Ala 
250 255 260 

FIG.1B-2 

ATT ATT CAA CAT ATG ATT AAG AAG CTT CGT TTC TCT AAT TTC CTG GAA 870 
Ile Ile Gln His Met Ile Lys Lys Leu Arg Phe Ser Asn Phe Leu Glu 
265 270 275 

TGT AGC CAA AGA CAT TCG TTC CAG ATT TCT CTG CAC TTT ACA GTA TGC 918 
Cys Ser Gln Arg His Ser Phe Gln Ile Ser Leu His Phe Thr Val Cys 
280 285 290 295 

CTG ATG AAC TTC AAT TGC TGC ATG GAC CCT TTT ATC TAC TTC TTT GCA 966 
Leu Met Asn Phe Asn Cys Cys Met Asp Pro Phe Ile Tyr Phe Phe Ala 
300 305 310 

TGT AAA GGG TAT AAG AGA AAG GTT ATG AGG ATG CTG AAA CGG CAA GTC 1014 
Cys Lys Gly Tyr Lys Arg Lys Val Met Arg Met Leu Lys Arg Gln Val 
315 320 325 

ACT GTA TCG ATT TCT AGT GCT GTG AAG TCA GCC CCT GAA GAA AAT TCA 1062 
Ser Val Ser Ile Ser Ser Ala Val Lys Ser Ala Pro Glu Glu Asn Ser 
330 335 340 

CGT GAA ATG ACA GAA ACG CAG ATG ATG ATA CAT TCC AAG TCT TCA AAT 1110 
Arg Glu Met Thr Glu Thr Gln Met Met Ile His Ser Lys Ser Ser Asn 
345 350 355 

GGA AAG TGAAATGGAT TGTATTTTGG TTTATAGTGA CGTAAACTGT ATGACAAACT 1166 
Gly Lys *** 
360 

TTGCAGGACT TCCCTTATAA AGCAAAATAA TTGTTCAGCT TCCAATTAGT ATTCTTTTAT 1226 
ATTTCTTTCA TTGGGCGCTT TCCCATCTCC AACTCGGAAG TAAGCCCAAG AGAACAACAT 1286 
AAAGCAAACA ACATAAAGCA CAATAAAAAT GCAAATAAAT ATTTTCATTT TTATTTGTAA 1346 

FIG.1B-3 


ACGAATACAC CAAAAGGAGG CGCTCTTAAT AACTCCCAAT GTAAAAAGTT TTGTTTTAAT 1406 
AAAAAATTAA TTATTATTCT TGCCAACAAA TGGCTAGAAA GGACTGAATA GATTATATAT 1466 
TGCCAGATGT TAATACTGTA ACATACTTTT TAAATAACAT ATTTCTTAAA TCCAAATTTC 1526 
TCTCAATGTT AGATTTAATT CCCTCAATAA CACCAATGTT TTGTTTTGTT TCGTTCTGGG 1586 
TCATAAAACT TTGTTAAGGA ACTCTTTTGG AATAAAGAGC AGGATGCTGC GGAATTC 1643 

FIG.1B-4 

GAATTCCGCA GCC ATG ACC CCG CAG CTT CTC CTG GCC CTT GTC CTC TGG 49 
Met Thr Pro Gln Leu Leu Leu Ala Leu Val Leu Trp 
1 5 10 

GCC AGC TGC CCG CCC TGC AGT GGA AGG AAA GGG CCC CCA GCA GCT CTG 97 
Ala Ser Cys Pro Pro Cys Ser Gly Arg Lys Gly Pro Pro Ala Ala Leu 
15 20 25 

ACA CTG CCC CGG GTG CAA TGC CGA GCC TCT CGG TAC CCG ATC GCC GTG 145 
Thr Leu Pro Arg Val Gln Cys Arg Ala Ser Arg Tyr Pro Ile Ala Val 
30 35 40 

GAT TGC TCC TGG ACC CTG CCG CCT GCT CCA AAC TCC ACC AGC CCC GGT 193 
Asp Cys Ser Trp Thr Leu Pro Pro Ala Pro Asn Ser Thr Ser Pro Gly 
45 50 55 

GTC CGT GGA TTG CGA CGT ACA GGC TCG GCA TGG CTG CCC GGG GCC ACA 241 
Val Arg Gly Leu Arg Arg Thr Gly Ser Ala Trp Leu Pro Gly Ala Thr 
65 70 75 

GCG TGG CCC TGC CTG CAG CAG ACG CCA ACG TCC ACC AGC TGC ACC ATC 289 
Ala Trp Pro Cys Leu Gln Gln Thr Pro Thr Ser Thr Ser Cys Thr Ile 
80 85 90 

ACG GAT GTC CAG CTG TTC TCC ATG GCT CCC TAC GTG CTC AAT GTC ACC 337 
Thr Asp Val Gln Leu Phe Ser Met Ala Pro Tyr Val Leu Asn Val Thr 
95 100 105 

GCC GTC CAC CCC TGG GGC TCC AGC AGC AGC TTC GTG CCT TTC ATA ACA 385 
Ala Val His Pro Trp Gly Ser Ser Ser Ser Phe Val Pro Phe Ile Thr 
110 115 120 

GAG CAC ATC ATC AAG CCC GAC CCT CCA GAA GGC GTG CGC CTA AGC CCC 433 
Glu His Ile Ile Lys Pro Asp Pro Pro Glu Gly Val Arg Leu Ser Pro 
125 130 135 140 

FIG.5A 

CTC GCT GAG CGC CAC GTA CAG GTG CAG TGG GAG CCT CCC GGG TCC TGG 481 
Leu Ala Glu Arg His Val Gln Val Gln Trp Glu Pro Pro Gly Ser Trp 
145 150 155 

CCC TTC CCA GAG ATC TTC TCA CTG AAG TAC TGG ATC CGT TAC AAG CGT 529 
Pro Phe Pro Glu Ile Phe Ser Leu Lys Tyr Trp Ile Arg Tyr Lys Arg 
160 165 170 

CAG GGA GCT GCG CGC TTC CAC CGG GTG GGG CCC ATT GAA GCC ACG TCC 577 
Gln Gly Ala Ala Arg Phe His Arg Val Gly Pro Ile Glu Ala Thr Ser 
175 180 185 

TTC ATC CTC AGG GCT GTG CGG CCC CGA GCC AGG TAC TAC GTC CAA GTG 625 
Phe Ile Leu Arg Ala Val Arg Pro Arg Ala Arg Tyr Tyr Val Gln Val 
190 195 

GCG GCT CAG GAC CTC ACA GAC TAC GGG GAA CTG ACT GAC TGG ACT CTC 673 
Ala Ala Gln Asp Leu Thr Asp Tyr Gly Glu Leu Ser Asp Trp Ser Leu 
205 210 215 220 

CCC GCC ACT GCC ACA ATG AGC CTG GGC AAG TAGCAAGGGC TTCCCGCTGC 723 
Pro Ala Thr Ala Thr Met Ser Leu Gly Lys 
225 230 

CTCCAGACAG CACCTGGGTC CTCGCCACCC TAAGCCCCGG GACACCTGTT GGAGGGCGGA 783 

TGGGATCTGC CTAGCCTGGG CTGGAGTCCT TGCTTTGCTG CTGCTGAGCT GCCGGGCAAC 843 

CTCAGATGAC CGACTTTTCC CTTTGAGCCT CAGTTTCTCT AGCTGAGAAA TGGAGATGTA 903 

CTACTCTCTC CTTTACCTTT ACCTTTACCA CAGTGCAGGG CTGACTGAAC TGTCACTGTG 963 

FIG.5B 

AGATATTTTT TATTGTTTAA TTAGAAAAGA ATTGTTGTTG GGCTGGGCGC AGTGGATCGC 1023 
ACCTGTAATC CCAGTCACTG GGAAGCCGAC GTGGGTGGGT AGCTTGAGGC CAGGAGCTCG 1083 
m rr^rr umirjaiOMMX ATCTCTAAAA AATTAATATA AATATAAAAT 1143 